BIOMETRIKA 




FOUNDED BY 


W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON


MANAGING EDITOR 


E. S. PEARSON

ASSOCIATE EDITORS
M. G. KENDALL JOHN WISHART
in consultation with 

HARALD CRAMER J. B. S. HALDANE 

F. N. DAVID H. O. HARTLEY 

R. C. GEARY D. G. KENDALL 


VOLUME 43 


1956 


ISSUED BY
THE BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON 
PRINTED AT THE UNIVERSITY PRESS, CAMBRIDGE 




CONTENTS OF VOLUME 43 
Memoirs and Miscellanea 


ANIS, A. A. On the moments of the maximum of partial sums of a finite number of independent normal variates
ANSCOMBE, F. J. On estimating binomial response relations
BAILEY, N. T. J. On estimating the latent and infectious periods of measles:
I. Families with two susceptibles only
II. Families with three or more susceptibles
BAILEY, N. T. J. Significance tests for a variable chance of infection in chain-binomial theory
BARTHOLOMEW, D. J. A sequential test of randomness for events occurring in time or space
BARTON, D. E. A class of distributions for which the maximum-likelihood estimator is unbiased and of minimum variance for all sample sizes
BARTON, D. E. Addendum. The limiting distribution of Kamat's test statistic
BARTON, D. E. and DAVID, F. N. Tests for randomness of points on a line
BLANK, A. A. Existence and uniqueness of a uniformly most powerful randomized unbiased test for the binomial
BLISS, C. I., COCHRAN, W. G. and TUKEY, J. W. A rejection criterion based upon the range
BLOM, G. Corrigenda to 'Transformations of the binomial, negative binomial, Poisson and χ² distributions'
BOSE, R. C. Paired comparison designs for testing concordance between judges
BROADBENT, S. R. Examination of a quantum hypothesis based on a single set of data
BROADBENT, S. R. Lognormal approximation to products and quotients
BUTCHER, J. C. Treatment variances for experimental designs with serially correlated observations
COCHRAN, W. G. (See BLISS, C. I.)
COX, D. R. A note on the theory of quick tests
CROW, E. L. Confidence intervals for a proportion
DANIELS, H. E. The approximate distribution of serial correlation coefficients
DARWIN, J. H. The behaviour of an estimator for a simple birth and death process
DAVID, F. N. (See BARTON, D. E.)
DAVID, F. N. A note on Wilcoxon's and allied tests
DAVID, H. A. On the application to statistics of an elementary theorem in probability
DAVID, H. A. Revised upper percentage points of the extreme studentized deviate from the sample mean
DERMAN, C. Some asymptotic distribution theory for Markov chains with a denumerable number of states
DOUGLAS, J. B. Tables of Poisson power moments
GANI, J. Sufficiency conditions in regular Markov chains and certain random walks
GANI, J. Corrigenda to 'Some theorems and sufficiency conditions for the maximum-likelihood estimator of an unknown parameter in a simple Markov chain'
GILBERT, N. E. G. Likelihood function for capture-recapture samples
GOOD, I. J. and TOULMIN, G. H. The number of new species, and the increase in population coverage, when a sample is increased
GUEST, P. G. Grouping methods in the fitting of polynomials to unequally spaced observations
HALDANE, J. B. S. and SMITH, SHEILA MAYNARD. The sampling distribution of a maximum-likelihood estimate
HANNAN, E. J. (See WATSON, G. S.)
HARLEY, B. I. Some properties of an angular transformation for the correlation coefficient
HEALY, M. J. R. Weighted probits allowing for a non-zero response in the controls
HurrLv, N. A. The fitting of regression curves with autocorrelated data
JAMES, G. S. (See TRICKETT, W. H.)
JAMES, G. S. On the accuracy of weighted means and ratios
JENKINS, G. M. Tests of hypotheses in the linear auto-regressive model. II. Null distributions for higher order schemes: non-null distributions
KAMAT, A. R. A two-sample distribution-free test
KENDALL, M. G. Studies in the history of probability and statistics. II. The beginnings of a probability calculus
LAWLEY, D. N. Tests of significance for the latent roots of covariance and correlation matrices
LAWLEY, D. N. A general method for approximating to the distribution of likelihood ratio criteria
McFADDEN, J. A. An approximation for the symmetric, quadrivariate normal integral
MALLOWS, C. L. Note on the moment-problem for unimodal distributions when one or both terminals are known
MEDHI, J. A note on the risks of error involved in the sequential ratio test
MITRA, S. K. (See ROY, S. N.)
MOORE, P. G. The estimation of the mean of a censored normal distribution by ordered variables


QUENOUILLE, M. H. Notes on bias in estimation
RAY, W. D. Sequential analysis applied to certain experimental designs in the analysis of variance
REIERSØL, O. A note on the signs of gross correlation coefficients and partial correlation coefficients
ROY, S. N. and MITRA, S. K. An introduction to some non-parametric generalizations of analysis of variance and multivariate analysis
ROY, S. N. and SARHAN, A. E. On inverting a class of patterned matrices
ROYSTON, ERICA. Studies in the history of probability and statistics. III. A note on the history of the graphical presentation of data
RUBEN, H. On the moments of the range and product moments of extreme order statistics in normal samples
RUBEN, H. On the sum of squares of normal scores
SMITH, SHEILA MAYNARD. (See HALDANE, J. B. S.)
STUART, A. Bounds for the variance of Kendall's rank correlation statistic
TAYLOR, J. Exact linear sequential tests for the mean of a normal distribution
TOULMIN, G. H. (See GOOD, I. J.)
TRICKETT, W. H., WELCH, B. L. and JAMES, G. S. Further critical values for the two-means problem
TUKEY, J. W. (See BLISS, C. I.)
WALKER, A. M. A goodness of fit test for spectral distribution functions of stationary time series with normal residuals
WATSON, G. S. On the joint distribution of the circular serial correlation coefficients
WATSON, G. S. A note on the circular multivariate distribution
WATSON, G. S. and HANNAN, E. J. Serial correlation in regression analysis. II
WATSON, G. S. and WILLIAMS, E. J. On the construction of significance tests on the circle and the sphere
WELCH, B. L. (See TRICKETT, W. H.)
WHITTLE, P. On the variation of yield variance with plot size
WILLIAMS, C. B. Studies in the history of probability and statistics. IV. A note on an early statistical study of literary style
WILLIAMS, E. J. (See WATSON, G. S.)
WILLIAMS, R. M. The variance of the mean of systematic samples
WISE, J. Stationarity conditions for stochastic processes of the autoregressive and moving-average type
WISHART, J. χ² probabilities for large numbers of degrees of freedom
WOODING, R. A. The multivariate distribution of complex normal variables


Book Reviews 


ADAMS, J. K. Basic Statistical Concepts . . . F. N. DAVID
BLACKWELL, D. and GIRSHICK, M. A. Theory of Games and Statistical Decisions . . . G. MORTON
BOOTH, A. D. Numerical Methods . . . F. G. FOSTER
BUSH, R. R. and MOSTELLER, F. Stochastic Models for Learning . . . A. R. JONCKHEERE
ELLINGER, A. G. The Art of Investment . . . A. STUART
FEDERER, W. T. Experimental Design. Theory and Application . . . S. C. PEARCE
FINNEY, D. J. Experimental Design and its Statistical Basis . . . S. C. PEARCE
HAYCOCKS, H. W. and PERKS, W. Mortality and Other Investigations, Vol. 1 . . . P. G. MOORE
HOGBEN, L. Choice and Chance by Cardpack and Chessboard, Vol. II . . . F. N. DAVID
KAPLAN, W. (Ed.) Lectures on Functions of a Complex Variable . . . D. E. BARTON
KOPAL, Z. Numerical Analysis . . . P. G. MOORE
LI, C. C. Population Genetics . . . C. A. B. SMITH
MILLS, F. C. Statistical Methods, 3rd Edition . . . F. BROWN
NATIONAL BUREAU OF STANDARDS. Publications of the U.S. Department of Commerce, Applied Mathematics Series 32, 34, 41
SNYDER, RICHARD M. Measuring Business Changes: a Handbook of Significant Business Indicators . . . A. STUART
SPIEGELMAN, M. Introduction to [...] . . . N. L. JOHNSON
SPROWLS, R. C. Elementary Statistics for Students of Social Science and Business . . . F. N. DAVID
THRALL, R. M., COOMBS, C. H. and DAVIS, R. L. Decision Processes . . . G. A. BARNARD
TIPPETT, L. H. C. [...] 2nd Edition . . . F. N. DAVID
WAERDEN, B. L. VAN DER and NIEVERGELT, E. Tafeln zum Vergleich zweier Stichproben mittels X-Test und Zeichentest . . . D. E. BARTON


BIOMETRIKA PUBLICATIONS: BOOKS OF TABLES

Issued by the Cambridge University Press, Bentley House, London, N.W.1
and obtainable from any bookseller

Tables of the Incomplete B-Function
EDITED BY KARL PEARSON
59 pages of Introduction and 494 pages of Tables
Price: 55s. net

Tables of the Incomplete Γ-Function
EDITED BY KARL PEARSON
31 pages of Introduction and 164 pages of Tables
Price: 42s. net

Tables of the Complete and Incomplete Elliptic Integrals
(from LEGENDRE's Traité des Fonctions Elliptiques. With autographed portrait of LEGENDRE)
39 pages of Introduction by KARL PEARSON and 94 pages of Tables
Price: 12s. 6d. net

Tables of the Ordinates and Probability Integral of the
Distribution of the Correlation Coefficient in Small Samples
By F. N. DAVID
38 pages of Introduction, 55 pages of Tables, 10 Diagrams and 4 Charts
Price: 17s. 6d. net

NEW PUBLICATION NOW AVAILABLE
Biometrika Tables for Statisticians, Vol. I

The two volumes of Tables for Statisticians and Biometricians are now out of print and will not be
re-issued. At the request of the Biometrika Trustees a complete recasting of these Tables has been
undertaken by Professor E. S. PEARSON and Dr H. O. HARTLEY. Many of the old tables have been set aside
or modified, tables which have been published during the last fifteen years in Biometrika are
reproduced and some new tables added.

Volume I of the new series, which includes the statistical and auxiliary mathematical tables in more
common use is now available. It contains an Introduction and 54 tables covering in all 238 pp.

Price 25s. net

NEW STATISTICAL TABLES: SEPARATES RE-ISSUED 
FROM BIOMETRIKA 


To be obtained from 


BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON, W.C.1 


I. From Biometrika, Vols. 22, 27 and 28
Tests of Normality. By E. S. PEARSON and R. C. GEARY
Price Two Shillings and Sixpence, post free

II. From Biometrika, Vol. 32, Part 2, pp. 168-181 and 188-189
(1) Table of percentage points of the incomplete beta-function
(2) Table of percentage points of the χ² distribution
Stitched together with introductory matter. Price Two Shillings and Sixpence, post free

III. From Biometrika, Vol. 32, Parts 3 and 4, pp. 300-310
(1) Table of the probability integral of the range in samples from a normal population
(2) Table of the percentage points of the range
(3) Table of the percentage points of the t-distribution
Stitched together with introductory matter. Price Two Shillings and Sixpence, post free

IV. From Biometrika, Vol. 33, Part 1, pp. 73-88
Table of percentage points of the inverted beta (F) distribution
With introductory matter. Price Two Shillings and Sixpence, post free

V. From Biometrika, Vol. 33, Part 3, pp. 252-265
(1) Table of the probability integral of the mean deviation in samples from a normal population
(2) Table of the percentage points of the mean deviation
Stitched together with introductory matter. Price Two Shillings and Sixpence, post free

VI. From Biometrika, Vol. 33, Part 4, pp. 296-304
Table for testing the homogeneity of a set of estimated variances
With introductory matter. Price Two Shillings, post free

VII. From Biometrika, Vol. 35, Parts 1 and 2, pp. 145-156
Table of significance levels for the Fisher-Yates test of significance in 2×2 contingency
tables. By D. J. FINNEY
With introductory matter. Price Two Shillings and Sixpence, post free

VIII. From Biometrika, Vol. 35, Parts 1 and 2, pp. 191-201
Table for the calculation of working probits and weights in probit analysis. By D. J. FINNEY
and W. L. STEVENS
With introductory matter. Price Two Shillings and Sixpence, post free

IX. From Biometrika, Vol. 36, Parts 3 and 4, pp. 267-289
Tables of autoregressive series. By M. G. KENDALL
With introductory matter. Price Two Shillings and Sixpence, post free

X. From Biometrika, Vol. 36, Parts 3 and 4, pp. 431-449
Tables of symmetric functions. Part I. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Two Shillings and Sixpence, post free

XII. From Biometrika, Vol. 37, pp. 168-172 and pp. 313-325
(1) Table of the probability integral of the t-distribution
(2) Table of the χ² integral, and of the cumulative Poisson distribution. By H. O. HARTLEY
Stitched together with introductory matter. Price Five Shillings, post free

XIII. From Biometrika, Vol. 38, Parts 1 and 2, pp. 112-130
Charts of the power function for analysis of variance tests, derived from the non-central
F-distribution. By E. S. PEARSON and H. O. HARTLEY
With introductory matter. Price Two Shillings and Sixpence, post free

XIV. From Biometrika, Vol. 38, Parts 3 and 4, pp. 435-462
Tables of symmetric functions. Parts II and III. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Four Shillings, post free

XV. From Biometrika, Vol. 38, Parts 3 and 4, pp. 423-426
A chart for the incomplete beta-function and the cumulative binomial distribution. By H. O.
HARTLEY and E. R. FITCH
With introductory matter and ruler scale. Price Two Shillings and Sixpence, post free

XVI. From Biometrika, Vol. 40, Parts 1 and 2, pp. 70-73
Tables of the angular transformation. By W. L. STEVENS
With introductory matter. Price One Shilling, post free

XVII. From Biometrika, Vol. 40, Parts 1 and 2, pp. 74-86
Tests of significance in a 2×2 contingency table: extension of Finney's table (No. VII).
Computed by R. LATSCHA
With introductory matter. Price Two Shillings and Sixpence, post free

XVIII. From Biometrika, Vol. 40, Parts 3 and 4, pp. 427-446
Tables of symmetric functions. Part IV. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Four Shillings, post free

XIX. From Biometrika, Vol. 41, Parts 1 and 2, pp. 253-260
Tables of generalized k-statistics. By S. H. ABDEL-ATY
With introductory matter. Price Two Shillings, post free

XX. From Biometrika, Vol. 42, Parts 1 and 2, pp. 223-242
Tables of symmetric functions. Part V. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Four Shillings, post free

XXI. From Biometrika, Vol. 42, Parts 3 and 4, pp. 494-511
A new form of table for significance tests in a 2×2 contingency table. By P. ARMSEN
With introductory matter. Price Two Shillings and Sixpence, post free

No. XI is out of print.


Biometrika Index

A Biometrika Index comprising Subject Index for Volumes 1-37 and
Author Index for Volumes 1-40 is now available

Price: 6s. net or $1.00

To be obtained from
BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON, W.C.1


BIOMETRIKA PUBLICATIONS 


Issued by the Cambridge University Press, Bentley House, London, N.W.1 
and obtainable from any bookseller 


The Life, Letters and Labours of Francis Galton, Vols. I, II, IIIA & IIIB
By KARL PEARSON, F.R.S. Price £3. 3s. net


Karl Pearson: An Appreciation of Some Aspects of his Life and Work 
By E. S. PEARSON Price 10s. 6d. net 


A Bibliography of the Statistical and Other Writings of Karl Pearson
Compiled by G. M. MORANT, with the assistance of B. L. WELCH Price 6s. net


'Student's' Collected Papers Edited by E. S. PEARSON and JOHN WISHART
with a Foreword by LAUNCE McMULLEN Price 15s. net


Karl Pearson's Early Statistical Papers 


Reprinted by photo-lithography for the Biometrika Trust, with the permission of the original publishers. The 
Volume contains eleven papers, including the more important of the memoirs entitled “Mathematical Contri- 
butions to the Theory of Evolution”, first published in the Philosophical Transactions of the Royal Society. The 


original paper deriving the χ²-distribution, published in 1900 in the Philosophical Magazine, is also included.


Price 25s. net 


ROYAL STATISTICAL SOCIETY 


SERIES A (GENERAL), VOL. 118, PART 4, 1955 
A Unified Derivation of some Well-known Frequency Distributions of Interest in Biometry and Statistics. By 
. O. IRWIN (with Discussion)—Colli , K. LipbELL—An Analysis 0f 162,332 Lottery Numbers. 
i S. aed of a Working Group— 
l d .E.G. Vital Statistics. By E. GREBENIK— 
International Statistical Institute, 29th Session, Ri Brazil, 1955. f il— 
Proceedings of the One Hundred and Twenty- eral Me to ee 


Or r first Annual General Meeting—Reviews of Books, Statistical and 
Current Notes, Additions to Library—Obituary: Philip Lyle, Stanley Jevons, George Findlay Shirras. 


Some Statistical Methods Connected with Series of Events. B. i 
Q / 1 - By D. R. Cox (with D 
Sube po pramming: -— ihe of Linear Programming. By S. Nain ee Minimizin 
near Inequalities. By E. M. L. BEALE—A Contributi f ing- 
- Morton and A. H. LAND (with Discussio: iral Content in pais ioe 


" — Stati ir 
—The Comparison of Means SP Sete ap Qu) atistical Concepts in their Relation toR 


uG; GowERr—Some Distribution 
5n y ) €r0-one Processes. By 
sed x Time in Bul 

Equalizing the Mean Waiting Times of Successive A Tice Queues, By F. D 


ROYAL STATISTICAL SOCIETY, 21 BENTINCK STREET, LONDON, W.1




BIOMETRICS 


Journal of the Biometric Society 


VOL. 12, No. 1 TABLE OF CONTENTS MARCH, 1956


F > . . z H 

Hee ie for mixed series. Milton Morrtson—Block effects in the determination of 
in split-plot n DN. Ropert M. DEBAUN—Adjustment by covariance and consequent tests of significance 
Funes WO Wise JEANNE Titus TRUETT and H. FAIRFIELD SMITH—AÀ note on the combination of 
ja monte eet ce in multipleassays. PAMELA M. CrARKE— Missing and ‘mixed-up’ frequencies 
K.V. e es. G. S. WarsoN— Contributions to simultaneous confidence interval estimation. 
RB ed ccce ea genetic drift ina tri-allelic locus: exact solution with a continuous model. 
GUAR isting nd ultivariate analysis and agricultural experiments. D. J. FiNNEY— The relation between 
ay’ bet graded responses to drugs. P. S. HEWLETT and R. L. PLAckErr—One likelihood estimate 

y be inadequate. H. W. NORTON. 
VOL. 12, No. 2 JUNE, 1956
ation. R. C. Lewontin and TIMOTHY PRouT— 


mpanao Hf the number of different classes in a popul 
Bonot AH of age-specific infection rates from a curve of relative infection. P. WHITTLE—An evalua- 
hit SEE oa] method of estimating animal populations. CALVIN ZipPIN—Note on fitting the multi- 
genetics. C Ar JOAN M. GuRIAN—The concept of path coefficient and its impact on population 
GEORGE Goan 1— Confidence limits for genetic heritability. F. A. GRAYBILL, FRANK MARTIN and 
veneer. W Ste | evaluation by numerical and subjective methods with application to dried 
general QM. UMAN, J. W. GorrsrtiN and D. LANTICAN—Programming analysis of variance for 
purpose computers. H. O. HARTLEY. 
SS 
lows: For American Statistical Association Members, 
Statistical Association or the Biometric Society, 


a subscription rates to non-members are as fol 

HA ; for subscribers, non-members of either American 
-00. Subscriptions should be sent to the 
MANAGING EDITOR, BIOMETRICS 

NATIONAL RESEARCH COUNCIL, OTTAWA 2, CANADA 


TRABAJOS DE ESTADISTICA 


REVIEW PUBLISHED BY INSTITUTO DE INVESTIGACIONES ESTADÍSTICAS
OF THE CONSEJO SUPERIOR DE INVESTIGACIONES CIENTÍFICAS
MADRID, SPAIN


RUM CONTENTS Cuad. 11 

P BRENY— L'etat actuel du probléme de Behrens-Fisher. : . 

eh Moore—The mean successive difference in samples from an exponential population. 

TAS, 
J. Bran — Estudio de un problema de distribución de mineral. e 
Ci Azorin Pocu—Un diseño factorial aplicado al estudio de la productividad. 
RONICA, BIBLIOGRAFIA. CUESTIONES Y EJERCICIOS. 

eh Pd Cuad. HI 

de corpüsculos contenidosen un cuerpo a partir de la dis- 


uM nde = 
fae SANTALO— Sobre la distribución de los tamanos 
E ibución en sus secciones o proyecciones. r 
OCOPIO ZoROA—Un problema de mínimo planteado en arquitectura. - P 
E DE LA SALA—Aportación al estudio de un método ráp do de cálculo para la aplicación del método del 
N implex al problema del transporte de Hitchcock. 
OTAS. 
A. Diaz UNGRIA, A. CAMACHO Y S. Rios—Análisis discriminante de dos m 


ENRIQUE BL i 6 
ANCO— Conceptos operacionales de la inspección. UN " 
s RoMANI— La teoría d las din aplicada a un problema de producción industrial. 
ROCOPIO ZOROA—El análisis operacional en la propaganda. 
EJERCICIOS. 


Cn 
ONICA. BIBLIOGRAFIA. CUESTIONES Y 


uestras de indios venezolanos. 


For everything in connection with works, exchanges and subscription write to Professor Sixto Rios,
Instituto de Investigaciones Estadísticas del Consejo Superior de Investigaciones Científicas
(Serrano, 123), Madrid, Spain.

The Review is composed of the fascicles published three times a year (about 350 pages), and its annual
price is [...] pesetas for Spain and South America and $4.00 U.S.A. for all other countries.


Annals of Human Genetics 


Formerly ANNALS OF EUGENICS 
Edited by L. S. PENROSE 


Vol. 20, Pt. 3 CONTENTS February, 1956 
Hereditary deaf-mutism, with particular reference to Northern Ireland. E. A. CHEESEMAN and A. C. 
STEVENSON—Taste threshold for phenyl-thio-urea in Malay school children. V. ‘THAMBIPILLAI—Genetical 
investigations in a north-Swedish population. Population structure, spastic oligophrenia, deaf-mutism. 
J. A. BOOK—Prospects of biochemical genetics in medicine. J. A. BOOK and R. KOSTMANN— REVIEWS, 


Vol. 20, Pt. 4 May, 1956 
—A possible case of delayed mutation in man. 

CHARLOTTE AUERBACH-— Genetics of dermal ridges: parent-child correlations for total finger ridge-count. 
i i ight based on an Italian sample. M. FRACCARO— 


ors. J. H. BENNETT and C. B. V. WALKER— 
The estimation and significance of the logarithm of a ratio of frequencies. J. B. S. HALDANE—Hair- 
colour variation in the United Kingdom. E. SUNDERLAN al material from three 
regions, by N. A. BARNICOT—A ibuti i i 


The Editor regrets that owing to recent increases in the cost of production it is necessary to raise the 
subscription price to 65s. net per volume of four quarterly parts (in U.S.A. $11.00) post free. Single issues
17s. 6d. (in U.S.A. $3.00) postage extra.


CAMBRIDGE UNIVERSITY PRESS 
BENTLEY HOUSE, 200 EUSTON ROAD, LONDON, N.W. 1 


The Annals of Mathematical Statistics 


The Official Journal of the Institute of Mathematical Statistics 
VOL. 27, NO. 2 


JUNE, 1956 


gem e c NUM. DOES Jr. and E. L, LEHMANN—On Regular Be 
Dwass—The Estimation of th 


On the Normal Ap 
test. MARTI 


a í c ion of the 

Mons. On the Disia K M e E the Simultaneous Analysis of Variance Test, K eee ial 

to the Kiefer-Wolfowitz Stochastic 

Occupancy Theory. Joun E. 
corem. D. 

— REPORT OF THE 


of Chung's Lemma 
FREUND and ARTHUR N. POZNER. A Vector Form of the Wald Wants on Restricted 


Wolfowitz-Hoeffding 




ECONOMETRICA 


JOURNAL OF THE ECONOMETRIC SOCIETY 


Contents of Vol. 24, No. 2, April 1956, include: 
JOHN G. KEMENY, OSKAR MORGENSTERN and G. L. THOMPSON. A Generalization of the von
Neumann Model of an Expanding Economy
WERNER Z. HIRSCH. Firm Progress Ratios
JOHN C. HARSANYI. Approaches to the Bargaining Problem Before and After the Theory of
Games: A Critical Discussion of Zeuthen's, Hicks', and Nash's Theories
R. DUNCAN LUCE and E. W. ADAMS. The Determination of Subjective Characteristic Functions
in Games with Misperceived Payoff Functions
HENRY TEICHER. Identification of a Certain Stochastic Structure
R. DUNCAN LUCE. Semiorders and a Theory of Utility Discrimination
[...] TIZARD. Note on Initial Conditions in the Solution of Linear Differential Equations with
Constant Coefficients
Book REVIEWS, NOTES AND ANNOUNCEMENTS 
Published Quarterly Subscription rate available on request 


The Econometric Society is an international society for the advancement of economic theory in its 


relation to statistics and mathematics. 
Subscriptions to Econometrica and inquiries about the work of the Society and the procedure 


in applying for membership should be addressed to
RICHARD RUGGLES, Secretary 
THE ECONOMETRIC SOCIETY, BOX 1264, YALE UNIVERSITY 
NEW HAVEN, CONNECTICUT, U.S.A. 


SANKHYA 


THE INDIAN JOURNAL OF STATISTICS 
EDITED BY P. C. MAHALANOBIS 


CONTENTS

Vol. 16, parts 1 & 2, 1955
The Maximization of National Income. By J. B. S. HALDANE
The Approach of Operational Research to Planning in India. By P. C. MAHALANOBIS
Indian Statistical Institute, Twenty-third Annual Report
Corrigenda. By G. KALLIANPUR and C. RADHAKRISHNA RAO

SUBSCRIPTION
CURRENT BACK NUMBERS
per issue per volume per issue
per volume 10 Rs. 4. Rs. 12/8
INDIA Rs. 30 RD $15.00 $4.50,
onetan $10.00

Subscriptions and orders for back numbers should be sent to
Statistical Publishing Society, 204/1 Barrackpore Trunk Road, Calcutta-35




AMERICAN STATISTICAL ASSOCIATION 
1108 16th St., N.W. Washington 6, D.C. 


JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION 


CONTENTS 
VOL. 51, NO. 273 MARCH 1956 


Editorial Collaborators 


Articles 


J. R. Bainbridge, Alison M. Grant and U. Radok. Tabular Analysis of Factorial 
Experiments and the use of Punch Cards 


Charles F. Carter and Mary Robson. A Test of the Accuracy of a Production Index 


W. Edwards Deming. On Simplifications of Sampling Design through Replication 
with Equal Probabilities and without Stages 


Howard L. Jones. Investigating the Properties of a Sample Mean by Employing
Random Subsample Means


Leslie H. Miller. Table of Percentage Points of Kolmogorov Statistics
Lincoln E. Moses. Some Theoretical Aspects of the Lot Plot Sampling Inspection
Plan


Stanley Reiter. Estimates of Bounded Relative Error for the Ratio of Variances of
Normal Distributions


George L. Edgett. Multiple Regression with Missing Observations among the
Independent Variables


Statistical Abstracts 
Book Reviews 


Publications Received 


THE AMERICAN STATISTICAL ASSOCIATION
INVITES AS MEMBERS ALL PERSONS INTERESTED IN:
1. Development of new theory and method,
2. Improvement of basic statistical data,
3. Application of statistical methods to practical problems,


VOLUME 43, PARTS 1 AND 2 JUNE 1956


STUDIES IN THE HISTORY OF PROBABILITY 
AND STATISTICS 
II. THE BEGINNINGS OF A PROBABILITY CALCULUS


By M. G. KENDALL 
Research Techniques Unit, London School of Economics and Political Science 


1. A previous article in this series by Dr F. N. David (1955) has reviewed the development
[...] in gaming up to the time of Fermat and Pascal, who are popularly but
[...] to have founded the calculus of probability. In this paper I shall
[...] the evolution of the idea of a probability calculus with especial reference to
what David calls the tantalizing period prior to A.D. 1600.

2. During the dark ages gaming was prevalent throughout Europe. At some unknown
[date] dice finally ousted tali as instruments of play: and since cards were not
[introduced] until about A.D. 1350 gaming must have been conducted mainly with dice for
nearly [a thousand] years. Efforts on the part of Church and State to control the evils
[associated] with it were as ineffectual then as they are today, and nothing is more indicative
of the [prevalence of] gambling than the continual attempts made to prevent it. The sermon
of St Cyprian of Carthage De Aleatoribus (c. A.D. 240) was echoed twelve hundred years
later in the famous sermon of St Bernardino of Siena Contra Alearum Ludos of
A.D. 1423. The gambling of the Germans referred to by Tacitus may perhaps have become
more moderate but was equally prevalent in the thirteenth century when we find Friedrich II
(A.D. 1232) issuing a law de aleatoribus and Louis IX (1255) forbidding not only the play but
[even] the manufacture of dice. A long series of edicts prohibiting the clergy from gaming
(e.g. by Otto der Grosse, A.D. 952, the Councils of Tréves in A.D. 1227 and 1238, the Council
of Worcester in A.D. 1240) are themselves eloquent of the failure on the part of the authorities
to repress the evil.
3. We must remember, however, that all these bans and prohibitions were not really
directed against games of chance as such but against the vices which accompanied them.
There seems to have been nothing impious in creating a chance event or in using it for
[purposes] of amusement. The Church was much more concerned about the drinking and
swearing which accompanied gaming; and the State was more concerned about the idleness,
thriftlessness and crime which were so often found among gamblers. Chaucer's Pardoner
puts the official view of his day by giving an example of the blasphemy which usually
accompanied a gambling game (almost certainly of hazard)*

By Goddés precious heart and by his nails
And by the blood of Christ that is in Hayles
Seven is my chance, and thine is cinq and trey.
By Goddés armés, if thou falsely play
This dagger shall throughout thine herté go!
This fruit cometh of the bitchéd bonés two:
Forswearing, iré, falseness, homicide.

Even chess, the most innocent of all games, was classed among the major vices, at least
for officials. The interdict of Louis IX referred to above says: 'They shall abstain...from
dice and chess, from fornication and frequenting taverns. Gaming-houses and the manufacture
of dice are prohibited throughout the realm.'*

* Here and elsewhere I have modernized the spelling of the English quotations to some extent where
the metre permits.


4. San Bernardino enumerates at length fifteen 'malignitates impiissimi ludi' but they
are all moral evils (love of gain, idleness, corruption of youth, etc.) with the exception of
blasphemy and contempt of the prohibitions of the Church. One feels that if there had
been anything to say about the impiety of eliciting chance events for innocent entertainment
the saint would certainly have said it. The general attitude of his time seems to have
been one of toleration of the actual play but stern opposition to its associated vices. There
is some positive evidence to the same effect. About A.D. 960 a certain bishop Wibold of
Cambray invented a clerical version of dice to which I shall refer later; this sagacious
realist evidently recognized the impossibility of stamping out the evil and hence attempted
to turn it into good. Participants in the third crusade (A.D. 1190) had, in their briefing
instructions, a carefully drawn up statement of the extent to which they might gamble;
no person below the rank of knight was permitted to play at all for money; knights and
the clergy might play but could not lose more than twenty shillings in twenty-four hours.
Chaucer, in the Franklin's Tale, refers to the playing of chess and tables (backgammon)
with the laudable object of distracting the heartbroken Dorigene. In 1484 Margery writes
to John Paston (24 December):


Please it you to wit that I sent your eldest son to my Lady Morley to have knowledge what sports
were used in her house in Christmas next following after the decease of my lord her husband; and she
said that there were none disguisings, nor harping, nor luting, nor singing nor none loud disports; but
playing at the tables and chess and cards; such disports she gave her folks leave to
play and none other.


5. We may also notice a series of laws, beginning in the reign of Edward III, prohibiting 
the playing of certain games in order to promote manly sports. An act of Henry VIII 
added dice and cards to the list of unlawful amusements, although Henry, like many other 
monarchs, set a very bad example to his subjects. These laws were militaristic in origin.
The common people were not to waste their leisure in playing peaceful games like bowls, 
ninepins, hockey and dice: their duty was to practice archery in readiness for the next war. 

6. I recall these [...] times to the Renaissance without interruption [...]
but among the middle classes and [...] [exa]sperating features of the many [...]
that authors invariably assume [...] line of development, [...]

* ‘Abstineant [...] vel scacis, et a fornicati[one] [...] etiam deciorum prohibemus omnino ... Fabrica vero deciorum [...]’
Taxillus is a diminutive of talus, but I do [...]

M. G. KENDALL 3 


8. The Romans played with four tali (hucklebones) but with only three tesserae (dice).
At some early stage versions of dice-playing with only two dice are mentioned; for example,
Bishop Eustathius, in a commentary on the Odyssey written in A.D. 1180, refers to games
with two dice. The Chaucerian extract quoted above also mentioned two dice. In 1707
Montmort wrote of ‘le quinquenove, le jeu de trois dez et le jeu du hazard. Les deux
premiers sont les seules jeux de dez qui soient en usage en France, le dernier n'est commun
qu'en Angleterre.’ Both quinquenove and hazard were played with two dice; all three
games are variants of the same idea.


9. Hazard, the game as distinct from its modern meaning of chance in general, was, 
I believe, brought back to Europe by the Third Crusaders. Godfrey of Bouillon gives 
a false derivation: ‘A Hazait [Hazar] s'en ala ung riche mandement, et l'apiel-on Hazait
Pour le fait proprement que ly dés fu fais et poins premierment.’ There can be little doubt
that the word derives from the Arabic al zhar, meaning a die.* Wherever it came from, the
name and the game must have spread rapidly through Europe. Jean Bodel's play Le Jeu
de Saint Nicolas, ascribed to the year A.D. 1200, refers to hazart. Salimbene (the son of
a crusader), writing about 1287, refers to playing ‘ad azardum alias ad taxillum’. Dante's
Purgatorio, written between A.D. 1302 and 1321, refers to azar; and Chaucer (about A.D. 1375)
uses the word several times. The exact rules of play are not, so far as I am aware, on record.
There were doubtless many variants. But the quotation from Chaucer given above suggests
that the essential features of the modern game of craps were present at an early date;
the addition of the numbers on two dice and the ‘chances’ of each player are clearly
indicated.†
* Libri [...] derivation: ‘Ce mot vient d'asar, qui en arabe [...]’ (p. 198) [...] obtaining two aces or two sixes. Libri [...]

† [...] hazard is not a probabilistic [...]. [...] a ‘main’ (e.g. in [...]) [...] again and may (a) [...] in
which case the dice pass to his opponent; (b) win outright by [...]; [(c) ...] other score which becomes his
[‘chance’]. He then goes [...] ‘chance’ turns up, losing in the first and winning in the second [case].

10. It might have been supposed that during the several thousand years of dice playing
preceding, say, the year A.D. 1400, some idea of the permanence of statistical ratios and
the rudiments of a frequency theory of probability would have appeared. I know of no
evidence to suggest that this was so. Up to the fifteenth century we find few traces of
a probability calculus and, indeed, little to suggest the emergence of the idea that a calculus
of dice-falls was possible. It may be that gamblers had a rough idea of relative frequencies
of occurrence (it is hard to see how they could fail to acquire such a thing); and as there is
some evidence of the manufacture of false dice from Roman times onwards [there was]
presumably a complementary notion of fair throwing. It may also be that some intelligent
[...] worked out the elements of a theory for
[...] cash value. But I do not really believe this. [...],
but not with permanent success.

11. The earliest work I know which would seem to have mentioned the number of ways
in which dice can fall is the game invented by Wibold, referred to above. So far as I am
aware no contemporary manuscript has survived but an account [...] (an
obscure one, incidentally) was given by the chronicler Baldericus in the eleventh century,
the work being first published in 1615. Wibold enumerated 56 virtues, one corresponding
to each of the ways in which three dice can be thrown, irrespective of order. Apparently
a monk threw a die three times, or threw three dice, and hence chose a virtue which he was
to practice during the next 24 hours. It does not sound much of a game, but perhaps 
I have misunderstood Baldericus’s account. The important point is that the partitional 
falls of dice were correctly counted. There was no attempt at assessing relative probabilities. 
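
The count of 56 can be checked by direct enumeration. The following is a minimal sketch in Python (the variable names are illustrative and do not come from the sources discussed); it lists the throws of three dice first irrespective of order and then with order taken into account.

```python
from itertools import combinations_with_replacement, product

# Throws of three dice irrespective of order: multisets of three faces.
unordered_throws = list(combinations_with_replacement(range(1, 7), 3))

# Throws of three distinguishable dice, with order taken into account.
ordered_throws = list(product(range(1, 7), repeat=3))

print(len(unordered_throws))  # 56
print(len(ordered_throws))    # 216
```

The first figure is the number of partitional falls counted by Wibold; the second is the number of equally likely cases needed for any assessment of relative probabilities.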


12. The use of dice for the purpose of choosing among a number of possibilities may well 
be much older than Wibold and certainly continued for long after his time. There exist 
several medieval poems in English, setting out the interpretations to be placed on the 
throws of three dice. The best-known is the Chaunce of the Dyse which is in rhyme royal; 
one verse for each of the 56 possible throws of three dice. For example, the throw 6, 5, 3
gives

Mercury that disposed eloquence
Unto your birth so highly was incline 
That he gave you great part of science 
Passing all folkés heartés to undermine 
And other matters as well define 
Thus you govern your wordés in best wise 
That heart may think or any tongue suffise. 


Another incomplete poem in the Sloane manuscripts also deals with the throws syste- 
matically but in a different manner; e.g. for 6, 5, 3, 


Thou that has six, five and three 


Thy desire to thy purpose may brought be 
If desire be to thee y-thyght 


[...] are probably [...] a certain interest in connexion with [...]
my present purposes the point to be noticed is that, for [...], the different possible throws
were enumerated and known without any reference to gaming or a probabilistic basis.

[...] erit minor, quam sit [...]
literae comprehendantur, ac totidem puncta [...]




[...] editions of hi[s ...]. [...] is, however, supposititious and several candidates have been
[...] authorship. The one generally preferred is Richard de Fournival (1200-50),
a gifted humanist of the Middle Ages and Chancellor of the cathedral of Amiens. If this
[...] the poem was presumably written between A.D. 1220 and 1250. It contains a long
passage dealing with sports and games, and with dicing in particular.* It is, perhaps, worth
[...] is (if genuine) the first known calculation of the number of ways of
throwing three dice. The text (taken from an edition published at Wolfenbüttel in 1662)
is given in an appendix (pp. 13-14 below).

[The] relevant passage may be briefly and freely construed as follows:

If all three [dice] are alike there are six possibilities; if two are alike and the other different there
are 30 [possibilities, because] the pair can be chosen in six ways and the other in five; and if all three are
different there are 20 ways, because 30 times 4 is 120 but each possibility arises in 6 ways. There are
56 possibilities.

But if all three are alike there is only one way for each number; if two are alike and one different
there are three ways; and if all are different there are six ways. The accompanying figure shows the
various ways.

[It follows, but is not stated, that the total number of ways is

(6 × 1) + (30 × 3) + (20 × 6) = 216.]
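
The three cases of the passage can be verified mechanically. The sketch below is an illustration only (the labels and the helper `pattern` are my own, not the poem's); it groups the unordered throws into the three cases and counts the orderings of each.

```python
from collections import Counter
from itertools import combinations_with_replacement, permutations

def pattern(throw):
    """Classify an unordered throw by how many distinct faces it shows."""
    return {1: "all alike", 2: "two alike", 3: "all different"}[len(set(throw))]

unordered = Counter()  # possibilities irrespective of order
ordered = Counter()    # ways once order is taken into account
for throw in combinations_with_replacement(range(1, 7), 3):
    kind = pattern(throw)
    unordered[kind] += 1
    ordered[kind] += len(set(permutations(throw)))

for kind in ("all alike", "two alike", "all different"):
    print(kind, unordered[kind], ordered[kind])
print(sum(unordered.values()), sum(ordered.values()))
```

The counts come out as 6, 30 and 20 possibilities, arising in 1, 3 and 6 ways each, which gives the totals 56 and 216.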

The [...] figures are shown in Plates 1 & 2, taken from Harleian MS. 5263.
[The figure] referred to above is also given in my Fig. 1, taken from the printed edition published
at Wolfenbüttel in 1662. The total of the last column is 108 which, doubled, gives us the
total number of ways of throwing three dice. If this is a thirteenth-century product (the
[...] seems to be fourteenth century) it is astonishingly in advance of its time; and
some of the phrases have a very modern ring (e.g. ‘tria schemata surgunt’, three cases
arise; ‘quemlibet cum dederis, reliqui duo permutant loca’, if you fix one, the others
permute in two ways).


16. There exists a medieval translation into French of the De Vetula, edited and published
in 1[...] [by] H. Cocheris, who is mainly responsible for the theory that de Fournival was the
[author]. [It] takes considerable liberties with the original text and is not always
easily [...] against it. This poem is attributed to the fourteenth century. So far as
[...] the translator seems to have failed to understand the main point. He merely
[...] the 16 possible scores with three dice and points out that some of them occur
[more oft]en than others. The essential step in the De Vetula has been lost.
17. In the sixth canto of the Purgatorio Dante mentions the game of hazard: 


Quando si parte il giuoco della zara 
Colui che perde si riman dolente 


Ripetendo le volte e tristo impara:

(When a game of hazard breaks up the loser remains behind mournfully recalling the throws and
learning by sad experience.)

[18. A] commentary on this passage published in 1477 says:

‘[Concerning] these throws it is to be observed that the dice are square and every face turns up, so
that [a number] which can appear in more ways [sc. as the sum of points on three dice] must occur
more frequently, as in the following example: with three dice, three is the smallest number which can
be thrown, and that only when three aces turn up; four can only happen in one way, namely as [a] two
and two aces, ...’


* I am [not] competent to express an opinion about the attribution of the De Vetula to de Fournival,
but [...] doubts on the point [...] if the later printed versions correctly record what the
author [wrote] about dice. Some of the critical passages may, however, be interpolations by later hands.




At this point the author seems to be on the verge of the usual fallacy that a three will 
occur equally as frequently as a four. But (more by luck than knowledge, in my view) he 
veers off this point and proceeds: ‘and so, as these numbers can only happen in one way 
at each throw, in order to avoid tedium and too long a wait, they are not reckoned in the 
game and are called hazards. And so for 17 and 18. ... The numbers in between can happen
in more ways; the number which can happen in most ways is said to be the best throw 
of the set.’ 
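
The distinction on which the commentator nearly stumbles, between the partitions giving a score and the ordered ways of throwing it, can be tabulated directly. The sketch below is illustrative; it borrows the words punctaturae and cadentiae from the heading of the De Vetula table purely as variable names for the two counts, which is my own reading of that heading.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

faces = range(1, 7)

# Partitions of each score (throws irrespective of order).
punctaturae = Counter(sum(t) for t in combinations_with_replacement(faces, 3))

# Ordered ways of throwing each score with three distinguishable dice.
cadentiae = Counter(sum(t) for t in product(faces, repeat=3))

for score in range(3, 19):
    print(score, punctaturae[score], cadentiae[score])

# The scores 3 to 10 account for half of all the ordered ways.
print(sum(cadentiae[s] for s in range(3, 11)))
```

A three and a four each have a single partition, but the four can be thrown in three ways against one for the three; the ordered ways for the scores 3 to 10 sum to 108, half of 216.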

[Fig. 1. Extract from the printed edition of the De Vetula (Wolfenbüttel, 1662):]

Quinquaginta modis & sex diversificantur
In punctaturis, punctaturaeque ducentis
Atque bis octo cadendi schematibus, quibus inter
Compositos numeros, quibus est lusoribus usus,
Divisis, prout inter eos sunt distribuendi,
Plene cognosces, quantae virtutis eorum
Quilibet esse potest, seu quantae debilitatis:
Quod subscripta potest tibi declarare figura.

Tabula III.
Quot Punctaturas & quot Cadentias habeat quilibet numerorum compositorum.


[...] (very unjustly) as Pascal’s arithmetic tri[angle]. [...]

Nevertheless, if it is the first step which counts, that step had already been made by
A.D. 1500.


Biometrika, Vol. 43, Parts 1 and 2
Kendall: Studies in the History of Probability and Statistics

[Plate 1. Facsimile of a manuscript page of the De Vetula, from Harleian MS. 5263.]

[Plate 2. Facsimile of a manuscript page of the De Vetula, from Harleian MS. 5263.]



19. Although the pioneer step in the De Vetula seems to have been overlooked by later 
writers in the next two centuries, the idea of enumerating the ways of obtaining given 
scores when permutations were taken into account must have been rediscovered by the 
beginning of the sixteenth century; for Cardano’s De Ludo Aleae contains the essential 
ideas and is dated on internal evidence as written about 1526.* Oddly enough, however, 
we find the first problems in probability (so far noticed in the records) in quite a different 
context. 


20. Fra Luca dal Borgo, or Paccioli, was an itinerant teacher of mathematics whose
Summa de Arithmetica, Geometria, Proportioni et Proportionalità, published in 1494, was
widely studied in Italy. He considers a simple version of what later became known as the
problem of points: A and B, playing at a fair game (not dice, but balla, presumably a ball
game) agree to continue until one has won six rounds; but the match has to stop when
A has won five and B has won three. How should the stakes be divided?


21. Paccioli makes [...] heavy weather of this, but his solution amounts to saying that
the stakes should be divided in the proportion 5:3. The error was noted by Tartaglia in
his monumental General Trattato of 1556 (which date, we may remark, is thirty years after
Cardano says that he was in possession of the basic principles embodied in the De Ludo
Aleae). Tartaglia was always glad to point out errors in Paccioli with an acid superiority
which foreshadows many of the modern writings on probability and statistics. He would
have been more justified on this occasion if the alternative solution which he propounds
had been correct, which it is not. He points out that according to Paccioli's rule, if A had
won one game and B none, A would take all the stakes, which is obviously unjust. He
then argues that the difference between A’s score (five) and B’s score (three) being two,
and this being one-third of the number of games needed to win (six), A should take one
third of B's [share] and the total stake should be divided in the ratio 2:1. Or so I interpret
his rather prolix discussion. It would appear that if A has x and B y games in hand when
the total number required to win is z, Tartaglia’s rule requires that A takes a proportion
$\tfrac{1}{2} + (x - y)/(2z)$ of the stake.
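
The two rules can be set side by side with a short calculation. The sketch below is a hedged illustration: the function names are hypothetical, the Tartaglia rule is taken in the form interpreted above, and `fair_share` is an addition for comparison only, namely the division by expected winnings in a fair game that was later associated with Pascal and Fermat, which neither author had.

```python
from fractions import Fraction
from functools import lru_cache

def paccioli_share(a_won, b_won):
    """A's share under Paccioli's rule: stakes divided as games won."""
    return Fraction(a_won, a_won + b_won)

def tartaglia_share(a_won, b_won, target):
    """A's share under Tartaglia's rule as interpreted above: 1/2 + (x - y)/(2z)."""
    return Fraction(1, 2) + Fraction(a_won - b_won, 2 * target)

@lru_cache(maxsize=None)
def fair_share(a_needs, b_needs):
    """A's chance of winning a fair game when A and B still need the given rounds."""
    if a_needs == 0:
        return Fraction(1)
    if b_needs == 0:
        return Fraction(0)
    return (fair_share(a_needs - 1, b_needs) + fair_share(a_needs, b_needs - 1)) / 2

print(paccioli_share(5, 3))      # 5/8, i.e. a division 5:3
print(tartaglia_share(5, 3, 6))  # 2/3, i.e. a division 2:1
print(fair_share(6 - 5, 6 - 3))  # 7/8, i.e. a division 7:1
print(paccioli_share(1, 0))      # 1: A takes all, as Tartaglia objected
```

On this reckoning both early rules undervalue A's position: B can win only by taking the next three rounds in succession.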

22. Two years after the Trattato there appeared a short work by G. F. Peverone, Due
Brevi e Facili Trattati, il Primo d'Arithmetica, l'Altro di Geometria. In the first of these
Peverone considers a similar problem, without reference to other writers. A has won
7 and B 9 games in a match going to 10 games. He gives two examples, which are
effectively the same, and argues in [...]:

A should put 2 crowns and B 12 crowns [or, equivalently, the stake should be divided in the
proportion 1:6]. For if A, like B, had one [game] to go each would put two crowns [or divide the stakes
in equal proportions]. [If A] had two games to go [aga]inst B’s one, he should put 6 crowns against B’s
[...] he would have won four crowns,† but with the risk of losing the
second after [winning] the first; and with three games to go he should put 12 crowns because the
difficulty and risk are doubled.


* Dr David informs me that she is convinced that Cardano obtained the substance of his work on
gambling from other sources. This would be in accordance with [the man's] character, for he was not
an originator in spite of his extensive knowledge and peculiar gifts. On the other hand, it must remain
a conjecture until those sources can be traced.

† ‘Se giuocassero a 1 giuoco, bastarebbero scutti 2; et a due giuochi [...] che vincendo solo 2 giuochi
guadagnarebbe scutti 4; ma questo sta con pericolo di perdere il secondo, vinto il primo: però deve
guadagnare scutti 6, et a 3 giuochi scutti 12, per che si indoppia la difficoltà e pericolo.’




23. I think this must be one of the nearest misses in mathematics. As far as the second
[...] to go and is staking two crowns. [...]
well acquainted with geometrical progressions and [...]

[24. ... v]ersions of hazard and primero have been struck by the
accurate judgement of probabilities [...]. The chance of success of the first
player at craps, for example, should be 1/2 and is actually 244/495; the relative values of
flush and straight at poker are correct although intuitively it is not clear what the order
should be. It seems, however, that this situation has been reached empirically and not by


calculation. Cardano fortunately gives us an account of Primero as known to him.* From
the pack of 52 the eights, nines [and tens were removed,] leaving 40. Four were dealt to
[each player.] The cards had individual values, two counting 12, three [...],
four 14 and five 15; six coun[ted ...,] an ace 16 and court cards 10. There
were five combinations:

(a) Numerus [...];
(b) Primero (all cards of different suits);
(c) Supremus (the three cards 7, 6, ace in the same suit);
(d) Fluxus (four cards of the same suit);
(e) Chorus (all cards of the same denomination).

These were valued in that order, a Primero beating a numerus and so forth. The categories
do not overlap and if two players held the same combination [...]
irrespective of suit, the elder hand won.

Now the chances of these events, or rather the number of ways [in which they can] occur
on random drawing, are [as follows]:
Chorus                                      10
Fluxus                                     840
Supremus                                   120
Primero                                  9,990
[Numerus:]
  Two of a suit             54,000
  Three of a suit           14,280
  Two pairs                 12,150      80,430
                                        ------
                                        91,390
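
These counts can be reproduced by enumerating every four-card hand from the 40-card pack. The sketch below rests on assumptions that should be stated plainly: the rank and suit labels are arbitrary, a hand is assigned to the highest category it satisfies so that the categories do not overlap, and numerus is taken to be whatever remains (two or three cards of a suit), since its definition is not legible above.

```python
from collections import Counter
from itertools import combinations

RANKS = ["A", "2", "3", "4", "5", "6", "7", "J", "Q", "K"]  # eights, nines, tens removed
SUITS = "SHDC"
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]   # 40 cards

def classify(hand):
    """Assign a four-card hand to a single category (assumed precedence)."""
    ranks = {rank for rank, _ in hand}
    suits = {suit for _, suit in hand}
    if len(ranks) == 1:
        return "chorus"      # all four of the same denomination
    if len(suits) == 1:
        return "fluxus"      # all four of the same suit
    cards = set(hand)
    if any({("7", s), ("6", s), ("A", s)} <= cards for s in SUITS):
        return "supremus"    # 7, 6 and ace of one suit, fourth card of another
    if len(suits) == 4:
        return "primero"     # all four suits different
    return "numerus"         # assumed: the remaining hands

counts = Counter(classify(hand) for hand in combinations(DECK, 4))
print(counts, sum(counts.values()))
```

Under these assumptions the supremus, which ranks below the fluxus, turns out to be the rarer of the two, and the five categories together exhaust the 91,390 possible hands.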

* In the Middle Ages [...]. For example, there were about
a dozen different versions of [...]. [Car]dano does not seem to have incorporated
a draw for new cards [...] refer to a later version
called ‘prime’ which does. [...] makes a character say ‘What will you play at?
primero, primo visto, sant, one-and-[...]’. Shakespeare also
mentions primero. I have seen it [...] [the ga]me (more properly primera) was imported from
Spain on the occasion of the marriage of Mary Tudor with Philip II. [...] used in it certainly
suggest a Spanish origin, but I do not see why the marriage of Henry [VIII with Catherine] of Aragon
could not have been the occasion, or, indeed, that a specific occasion need be [...].




In other words, the relative values of the fluxus and the supremus in Cardano's version
were in the inverse order of their probabilities. At what point the present (correct) order
in the modern game of poker emerged, I do not know, but it seems to have been before
anyone was in a position to calculate the chances and persuade his fellows on the basis of
mathematics that the orders should be reversed. In my opinion relative chances were all
reached on the basis of intuition or trial and error in the games played up to the middle of
the seventeenth century.
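
The chance of the first player at craps, mentioned earlier in this section, is a convenient example of a value that play had brought close to one half without calculation. The sketch below assumes the modern rules, which are not set out in the text: the caster wins at once on 7 or 11, loses at once on 2, 3 or 12, and otherwise throws until either the first score (the point) or a 7 recurs.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Probability of each total with two fair dice.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
prob = {total: Fraction(n, 36) for total, n in ways.items()}

# Immediate win on the first throw (assumed modern rule).
win = prob[7] + prob[11]

# Otherwise the point must recur before a 7 does.
for point in (4, 5, 6, 8, 9, 10):
    win += prob[point] * prob[point] / (prob[point] + prob[7])

print(win, float(win))
```

The result is 244/495, a little under 0.493, so that the game is very nearly but not quite even.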

25. It seems clear that in fifteenth-century Italy the basic problems of chance in gaming 
had been raised and some small progress made towards solving them. A more thorough
examination of the Italian mathematical books of the period may reveal further evidence
on the point. One suspects that some of the simpler problems were circulated as a kind
of puzzle, just as they are at the present day, without becoming of any recognized scientific
importance. Galileo in his fragment Sulla Scoperta dei Dadi, written some time before
1642 (the date of his death), gives a complete solution of a problem in direct probability 
by correct enumeration of all possibilities, and he writes as if the problem were a new one, 
mentioning no previous authors.* Nevertheless, if Cardano’s treatise is to be correctly 
assigned to 1526 the ideas must have been current for a century before Galileo wrote. It 
would appear that a calculus of probability not only was late in developing but that, once 
begun, it progressed exceedingly slowly. 

26. Before we consider the reasons for this, something remains to be said about 
developments in France in the first half of the seventeenth century. The cradle of the 
probability calculus was undoubtedly, in my opinion, in Italy. From the fourteenth
century, however, there were close connexions between France and Italy of a political as well
as a geographical kind and an intellectual movement in one often generated a sympathetic
movement in the other. The invasion of Italy by Charles VIII in 1494, though militarily
and politically a failure, is generally regarded as a useful piece of intellectual
cross-fertilization. Undoubtedly, a great many Italian works of art and ideas found their way
to France with the remnant of Charles’s army, although I doubt whether a copy of Paccioli’s
book was amongst them. In this case also a search among French books on mathematics
written between A.D. 1400 and 1650 might prove to be very instructive.

27. The lack of written references to problems in the probability calculus is not
necessarily indicative of a lack of contemporary interest. Knowledge of chances was so
rudimentary that any capacity to gauge them accurately in play was worth a good deal
of money. Huyghens, visiting France in 1657, found intense interest being taken in the
doctrine of chances among mathematicians but encountered also a certain coyness about
the disclosure of results. This was presumably due to fear of anticipatory publication
rather than loss of income. Huyghens, being the man he was, merely worked out the theory
for himself. A Latin translation of his little book, De Ratiociniis in Ludo Aleae, printed by
van Schooten in 1657, was the first book published on the probability calculus and exercised
a profound influence on James Bernoulli and Demoivre.

* This is not a very weighty consideration. Early writers on probability, like those of the present
day, often failed to mention their indebtedness to their predecessors. Laplace was notoriously bad at it.

28. Now we come to the most interesting question of this period. Why was it that the
calculus of probabilities was so long in emerging? [We] cannot suppose that the Greeks were
incapable of making the necessary generalizations, even if they were hampered in working
out the details by their arithmetic and algebra. The same is true of the Arabs and of the
early medieval Europeans. Dr David has suggested that imperfections in the dice [may]
have something to do with it, but I cannot believe that this was a major reason. Some of
the dice were, in fact, quite well made. The races which built the Parthenon, [...]
Column, St Sophia and Notre Dame were quite capable of turning out a few cubes as [good]
as any of those in current use. Nor do I think backward mathematical notation had mu[ch ...]


[...] other possibilities are worth examining:
(a) the absence of a combinatorial algebra (or at any rate, of combinatorial ideas);
(b) the superstition of gamblers;
(c) the absence of a no[tion ...];
(d) moral or religious b[...].


[29. Combinatorial al]gebra does not seem to have been cultivated by the ancien[ts ...]
[th]e sixteenth and seventeenth centuries. Leibniz published [...]
De Combinationibus Alternationibus et P[...]
[...] essential ideas could be traced back a good [...]
was really under way, [...]. [...], it seems to me, the absence of such an alge[bra ...]
emergence of the doctrine of chance. Cardan[o ...]

[30. ...] [distribu]ted unequally in the long run. But it seems quite [poss]ible
[...] incompatible propositions; and with sufficient ingenuity,
I suppose, it is also possible to r[...] large numbers with a belie[f ...]

[31. ...] [am]ong intelligent peopl[e ...]
emergence of the probability calculu[s ...]
mental factor. The very notion of chance itself, the ide[a ...]
that a proposition may be true and false in fixed relative [...]
are nowadays so much part of ou[r ...]
that they were not so to our ancestors. It is in basic attit[udes ...]
world, in religious and moral teac[hing ...] [in]cline to seek for an explanation
of the delay. Mathematics never leads thought, but only expresses it.


* The origins of combinatorial algebra would themselves make an interesting historical study.
Wallis, at the age of 25, established a reputation for himself by deciphering Royalist [...]
during the Civil War. Bacon, in the reign of Elizabeth I, also took a keen interest in cryptography.
Both men used ciphers based on combinations of symbols.




32. The Greeks and the Romans (so far as one can make summary statements about 
races whose members held such differing views) seem, on the whole, to have regarded the 
world as partly determined by chance. Gods and goddesses had influence over the course 
of events and, in particular, could interfere with the throwing of dice; but they were only 
higher beings with superhuman powers, not omnipotent entities who controlled everything.
And the vaguer deities, Fortuna, the Fates and Fate itself appear to modern eyes
more in the retributive role of a personified guilty conscience than as masters of the
Universe. The situation was radically changed by Christianity. For the early fathers of
the Church the finger of God was everywhere. Some causes were overt and some were
hidden, but nothing happened without cause. In that sense nothing was random and
there was no chance. ‘Nos eas causas’, says St Augustine, ‘quae dicuntur fortuitae (unde
etiam fortuna nomen accepit) non dicimus nullas, sed latentes; easque tribuimus vel veri
Dei, vel quorumlibet spirituum voluntati.’ This view prevailed also in medieval times.
Thomas Aquinas, arguing that everything is subject to the providence of God, mentions
explicitly the objection that, if such were the case, hazard and luck would disappear. He
replies that there are universal and particular causes; a thing can escape the order of
a particular cause but not of a universal cause; and so far as it escapes it is said to be
fortuitous with respect to that cause. St Thomas has an Aristotelean view of primary and
secondary causes but we need not follow closely his struggles with the problems of causality,
predestination and free-will. He reflected the spirit of his age, wherein God and an
elaborate hierarchy of His ministers controlled and fore-ordained the minutest happening: 
if anything seemed to be due to chance that was our ignorance, not the nature of things.* 

33. St Thomas is sometimes quoted as having expressed himself in favour of a frequency
theory of probability, but, in my opinion, this rests on a source of confusion which it [may]
be useful to remove in passing. Throughout this article I have been speaking of the doctrine
of chances (which Demoivre translated as Mensura Sortis), not probability in the wider
sense. Early writers used probabilitas with a different meaning, as relating to the degree
of doubt with which a proposition is entertained. At the outset of our science the two
[thin]gs were distinct and it is a pity that they have not remained so and that our language
has tended to confuse them. It seems to have been J[ames] Bernoulli who first thought of
applying the doctrine of chances to the art of conjecture; and although we find applications
to the assessment of the credibility of witnesses as early as [...] it is not until Bayes's
time (1763) that it was also applied to the acceptability of hypotheses. The resulting
confusion has existed ever since and at the present time seems, if anything,
to be getting worse. [...] any justification for the study of the history of probability
and [...] it would be found simply and abundantly in this, that
a [...] of the [...] of the subject would have rendered superfluous much of
what has been written about it in the last thirty years.

* The [...], of course, has come down to modern times in [...] descent but in a less
[...]. [...] ‘Deficiency in our knowledge’; D’Alembert, [...] ‘[...] de hasard [à]
proprement parler mais il y a son équivalent: [...] événements.’ More recently Paul Lévy in 1939:
‘Nous pensons, quoique depuis les travaux de Heisenberg d'éminents savants ne soient pas de cet avis,
que la notion [...] une notion que le savant introduit parce qu'elle est commode et féconde, mais que [...]’

34. Aquinas does not give a definition of probabilitas, [...] ‘Probabilia
[...] quae videntur omnibus, aut plerisque; [...]
plurimis maxime nobilibus et probatis.’ St Thomas himself regarded probabilitas [as a]
quality which gave rise to an opinion. He says explicitly that it admits of degrees. [Of]
chance (casus), he says: ‘Ea quae accidunt semper vel frequenter non sunt casualia [neque]
fortuita, sed quae accidunt in paucioribus.’ And again ‘Sicut in rebus naturalibus in [his]
quae ut in pluribus agunt, gradus quidam attenditur quia quanto virtus naturae [est]
fortior, tanto rarius deficit a suo effectu, ita et in processu rationis qui non est [cum]
omnimoda certitudine, gradus aliquis invenitur, secundum quod magis et minus [ad]
perfectam certitudinem acceditur.’ As I understand the position, St Thomas rec[ognized]
that probabilitas preceded certainty in the formation of knowledge; and that the frequen[cy ...]
[...] with the ‘fortuitous’ nature of the causality and the [...]
in his writings an explicit [...]
were very closely related. The [...]
while to give it, but it seems plain to me th[at]
the doctrine of chances was not present to his mind when probabilitas was under discussion.


t the religious attitude of the a 
udy of random behaviour. Even t d 
nation, but I think it very likely oe 
e feeling that every event, however trivial, happened unde 


Divine providence may have been a severe obstacle to the development of a calculus of


; ; to 
nity several hundred years to accustom itself


à ts 
nts were without Cause; or, at least, wherein large fields of "n 
were determined by a causality so r uld be accurately represented y 


he 
… Humanity as a whole has not accustomed itself to the
idea yet. Man in his childhood is still afraid … and few prospects are darker than …
to mechanistic law and to blind chance.
Ppears undeniable that the doctrine of chances to 


à - Once launched, of course, it proceeded very ee 
there is only a hundred Years between Bernoulli’s Ars Coniectandi and Laplace's Traw 


e of Primero, to Prof, W. Rose 
an and French gaming, and t0 


++-Super vitam Pontificis neque super 
personarum ecclesiasticarum,’ i 


» but I su ose i 
chief early writers on the Probability calcul ; Pp accidental 
Galileo were both victims of the Inquisition, 
been driven from Antwerp by Spanish perse 
because of the revocation of the Edict of Nantes, 


: x ort escaped; but 
provinces and published nothing on probability, and Montmort also liveq ee 


M. G. KENDALL 13 


… Dr F. N. David, with whom I have had many discussions on this … who read this article in manuscript.
REFERENCES 


David, F. N. (1955). Dicing and gaming (a note on the history of probability). Biometrika, 42, 1.
Libri, G. (1838-41). Histoire des Sciences Mathématiques en Italie, 2 …


APPENDIX 


Extract from ‘De Vetula’ 


Forte tamen dices, quosdam praestare quibusdam 
Ex numeris, quibus est lusoribus usus, eo quod 
Cum decius sit sex laterum, sex & numerorum 
Simplicium, tribus in deciis sunt octo decemque, 
Quorum non nisi tres possunt deciis superesse. 
Hi diversimode variantur, & inde bis octo 
Compositi numeri nascuntur, non tamen aequae 
Virtutis, quoniam majores atque minores 
Ipsorum raro veniunt, mediique frequenter, 

Et reliqui, quanto mediis quamvis propiores, 
Tanto praestantes, & saepius advenientes. 

His punctatura tantum venientibus una,

Illis sex, aliis mediocriter inter utrosque, 

Sicut sint duo majores, totidemque minores, 
Una quibus sit punctatura, duoque sequentes, 
Hic major, minor ille, quibus sit bina duobus.
Rursum post istos sit terna, deinde quaterna, 
Quinaque, sicut eis succedunt appropiando 
Quattuor ad medios, quibus est punctatio sena, 
Quae reddet leviora tibi subjecta tabella. 


Hi sunt sex & quinquaginta modi veniendi, 

Nec numerus minor esse potest, vel major, eorum. 
Nam quando similes fuerint sibi tres numeri, qui 
Jactum componunt quia sex componibiles sunt, 
Et punctaturae sunt sex, pro quolibet una.

Sed cum dissimilis aliis est unus eorum, 

Atque duo similes, triginta potest variari 
Punctatura modis, quia, si duplicaveris ex se 
Quemlibet, adjuncto reliquorum quolibet, inde 
Producens triginta, quasi sex quintuplicabis. 
Quod si dissimiles fuerint omnino sibi tres, 
Tunc punctaturas viginti connumerabis.

Hoc ideo, quia continui possunt numeri tres 
Quattuor esse modis; discontinui totidem: sed 
Si duo continui fuerint, discontinuusque 
Tertius invenies hinc tres bis, & inde duos ter:
Quod tibi declarat oculis subjecta figura. 

Rursum sunt quaedam subtilius inspicienti
De punctaturis, quibus una cadentia tantum est;
Suntque, quibus sunt tres aut sex, quia schema cadendi
Tunc differre nequit, quando similes fuerint tres
Praedicti numeri. Si vero sit unus eorum
Dissimilis, similisque duo, tria schemata surgunt,
Dissimili cuicunque superposito deciorum.
Sed si dissimiles sunt omnes, invenies sex
Verti posse modis, quia, quemlibet ex tribus uni


Cum dederis, reliqui duo permutant loca; sicut 
Punctaturarum docet alternatio. Sicque 
Quinquaginta modis et sex diversificantur 

In punctaturis, punctaturaeque ducentis 

Atque bis octo cadendi schematibus, quibus inter 
Compositos numeros, quibus est lusoribus usus, 
Divisis, prout inter eos sunt distribuendi 

Plene cognosces, quantae virtutis eorum 

Quilibet esse potest, seu quantae debilitatis: 
Quod subscripta potest tibi declarare figura. 


[ 15 ] 


ON ESTIMATING THE LATENT AND INFECTIOUS 
PERIODS OF MEASLES 


I. FAMILIES WITH TWO SUSCEPTIBLES ONLY 


By NORMAN T. J. BAILEY 
Design and Analysis of Scientific Experiment, 6 Keble Road, Oxford 


1. INTRODUCTION 


The analysis of the household distribution of cases of measles by means of chain-binomial
models has been comparatively successful (see Bailey (1955) for discussion and bibliography).
At the same time considerable variation in the time interval between successive cases has
usually been observed, and this sometimes leads to difficulties in identifying the links of the
chain. An attempt was therefore made to produce a model which would take these variations
into account. The simplest feasible arrangement is to assume that after the receipt of
infection there follows a latent period which is approximately normally distributed. Then
comes a period of infectiousness which is effectively terminated after a constant time by the
appearance of symptoms and removal of the individual concerned from circulation. The
latent and infectious periods taken together constitute what is usually called the incubation
period. Arguments in favour of this model have been set out in detail elsewhere (see Bailey
(1954, 1955), to which reference should be made for a fuller discussion). Simple estimates of
the four parameters involved were given for families with two susceptibles only. However,
at least two of the large sample variances were high, and there was considerable doubt as to
the efficiency achieved. The purpose of the present paper is accordingly to develop a
maximum-likelihood scoring procedure which will give efficient estimates and also make
available a goodness-of-fit test.
2. MATHEMATICAL MODEL AND SAMPLING DISTRIBUTIONS

Let us suppose that the latent period, $x$, is normally distributed with mean $m$ and variance
$\sigma^2$, while the ensuing infectious period is of constant length $a$. Infection of the second
susceptible during this time is taken to be a Poisson process such that the chance of
contracting the disease in time $dt$ is $\lambda\,dt$.
Suitable material for analysis (kindly made available to me by Dr R. E. Hope Simpson)
is shown in Table 1. This is based on families with two susceptible children under 15 years
of age and at least one case of measles, and was taken from the Cirencester area over the
years 1946-52. The distribution appears to involve two distinct parts which overlap a little.
Distribution A containing $A$ families is considered to arise from both susceptibles having
been simultaneously infected by an outside contact. The $B$ families in the B-distribution
are taken to be examples of cross-infection within the family. We shall assume for the time
being that observations can be allotted to the correct distribution with complete accuracy.
In practice an arbitrary decision (as in Table 1) may have to be made about borderline cases.
A refinement of analysis is to introduce and estimate an additional parameter, the prior
chance that a family belongs to, say, distribution A. The kind of procedure required is
outlined below in §6. There are $C$ families with only one case, and a total of $N = A + B + C$.


16 Estimating the latent and infectious periods of measles. 1 
Let $w$ be the variable for distribution A. Since this is the absolute difference between two
independent latent periods the frequency distribution of $w$ is

$$f(w) = \frac{1}{\sigma\sqrt{\pi}}\,e^{-w^2/4\sigma^2} \qquad (0 \le w < \infty). \tag{1}$$

Next, consider the $B + C$ families with either both a primary and a secondary case ($B$) or
a single primary case only ($C$). The chance of the second susceptible escaping infection by the
first case during the infectious period $a$ is $e^{-\lambda a}$. Hence the probability of the observed
numbers, $B$ and $C$, given $B + C$, is

$$\binom{B+C}{C}\,e^{-C\lambda a}\,(1 - e^{-\lambda a})^{B}. \tag{2}$$

Further, the distribution of the epoch $\tau$, measuring from the beginning of the infectious
period, at which infection of the second case actually takes place, is evidently

$$f(\tau) = \lambda e^{-\lambda\tau}(1 - e^{-\lambda a})^{-1} \qquad (0 \le \tau \le a). \tag{3}$$

Let distribution B arise from a variable $z = x + \tau$, where of course $x$ has the frequency
function

$$f(x) = (2\pi\sigma^2)^{-1/2}\exp\{-(x - m)^2/(2\sigma^2)\} \qquad (-\infty < x < \infty). \tag{4}$$

The frequency distribution of $z$ is obtained by using (3) and (4) to write down the joint
frequency distribution for $x$ and $\tau$, replacing $x$ by $z - \tau$, and integrating out $\tau$. This gives

$$f(z) = \frac{\lambda\,e^{-\lambda(z - m - \frac12\lambda\sigma^2)}}{1 - e^{-\lambda a}} \int_{w}^{u} \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 t^2}\,dt, \tag{5}$$

where

$$u = \sigma^{-1}(z - m - \lambda\sigma^2) \quad\text{and}\quad w = u - a\sigma^{-1}. \tag{6}$$

We shall also need the sample mean, $\bar z$, and variance, $v$, of the B-distribution, together
with the observed second moment about the origin, $y = \sum w^2/A$, for the A-distribution.
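As a numerical check on the B-distribution of $z = x + \tau$, the following sketch evaluates the frequency function (5), writing the normal integral between the limits in (6) with the error function. It then confirms by crude midpoint integration that the density integrates to one and that its mean is $m + \lambda^{-1} - a(e^{\lambda a} - 1)^{-1}$. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def b_density(z, lam, a, m, sigma):
    """Frequency function (5) of z = x + tau (hypothetical helper name)."""
    u = (z - m - lam * sigma ** 2) / sigma   # upper limit in (6)
    w = u - a / sigma                        # lower limit in (6)
    r = normal_cdf(u) - normal_cdf(w)        # normal integral from w to u
    scale = lam * math.exp(-lam * (z - m - 0.5 * lam * sigma ** 2))
    return scale * r / (1.0 - math.exp(-lam * a))

def midpoint_integral(func, lo, hi, steps=20000):
    """Plain midpoint rule; adequate for this smooth density."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

# Illustrative parameter values only (assumed, in days).
lam, a, m, sigma = 0.25, 6.5, 8.5, 1.7

total = midpoint_integral(lambda z: b_density(z, lam, a, m, sigma), -20.0, 60.0)
mean = midpoint_integral(lambda z: z * b_density(z, lam, a, m, sigma), -20.0, 60.0)
expected_mean = m + 1.0 / lam - a / math.expm1(lam * a)

print(total, mean, expected_mean)
```

Evaluating the density at interval mid-points, as is done for the fitted B values in Table 1, amounts to calling `b_density` at the integer day values.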


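3. MAXIMUM-LIKELIHOOD SCORING
Using the three frequency functions given in (1), (2) and (5) above we can proceed in the
usual way to derive maximum-likelihood scores and information functions for the
parameters $\lambda$, $a$, $m$ and $\sigma$. In order to do this as concisely as possible let us first write

$$R = \frac{1}{\sqrt{2\pi}}\int_{w}^{u} e^{-\frac12 t^2}\,dt, \tag{7}$$

and

$$T_\theta = R^{-1}\,\partial R/\partial\theta \qquad (\theta = \lambda, a, m, \sigma). \tag{8}$$

Amalgamating the three contributions to any score then gives quite simply

$$\begin{aligned}
S_\lambda &= \partial L/\partial\lambda = B\{\lambda^{-1} - (\bar z - m - \lambda\sigma^2)\} - Ca + \sum T_\lambda,\\
S_a &= \partial L/\partial a = -C\lambda + \sum T_a,\\
S_m &= \partial L/\partial m = B\lambda + \sum T_m,\\
S_\sigma &= \partial L/\partial\sigma = -A\sigma^{-1}(1 - \tfrac12 y\sigma^{-2}) + B\lambda^2\sigma + \sum T_\sigma,
\end{aligned} \tag{9}$$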

Norman T. J. BAILEY 17 


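where $L$ is as usual the log likelihood. The quantities $T_\theta$ are most easily calculated from the
functions

$$Q(x) = (2\pi)^{-1/2}e^{-\frac12 x^2} \quad\text{and}\quad P(x) = (2\pi)^{-1/2}\int_{-x}^{x} e^{-\frac12 t^2}\,dt \ \ (x > 0), \qquad P(x) = -P(|x|) \ \ (x < 0), \tag{10}$$

using the tables published by the New York W.P.A. (1942). We then have

$$\begin{aligned}
R &= \tfrac12(P - P'), & \partial R/\partial\lambda &= -\sigma(Q - Q'),\\
\partial R/\partial a &= \sigma^{-1}Q', & \partial R/\partial m &= -\sigma^{-1}(Q - Q'),\\
\partial R/\partial\sigma &= -\sigma^{-1}(uQ - wQ') - 2\lambda(Q - Q'),
\end{aligned} \tag{11}$$

where $P = P(u)$, $P' = P(w)$, $Q = Q(u)$, $Q' = Q(w)$.

The $T_\theta$ are then obtained from (8), (10) and (11).
The derivation of information functions also goes in a straightforward way. A minor
point worth mentioning is that if we differentiate $T_\theta$ with respect to one of the parameters,
say $\phi$, we obtain

$$-T_\theta T_\phi + \frac{1}{R}\frac{\partial^2 R}{\partial\theta\,\partial\phi}.$$

The expectation of $R^{-1}\partial^2 R/\partial\theta\,\partial\phi$ is easily found, since when multiplying by the frequency
function in (5) the factors $R$ in numerator and denominator cancel, and the integration with
respect to $z$ gives no special difficulty. With $T_\theta T_\phi$, on the other hand, the integrand involves
a factor $R^{-1}$, and it seems best to leave these terms as observed quantities. In any case the
individual values of $T_\theta$ and $T_\phi$ for each $z$ have already been calculated in finding the scores.
The information matrix, $I$, for the parameters $\lambda$, $a$, $m$ and $\sigma$, in that order, then turns
out to be (only the upper triangle of the symmetric matrix is shown)

$$I = \begin{pmatrix}
\sum T_\lambda^2 - \beta(\lambda^2\sigma^4 + \sigma^2 - \lambda^{-2}) & \sum T_\lambda T_a + \dfrac{\beta(1 + \lambda^2\sigma^2)}{e^{\lambda a} - 1} & \sum T_\lambda T_m - \beta(1 + \lambda^2\sigma^2) & \sum T_\lambda T_\sigma - \beta\lambda^3\sigma^3\\
\cdot & \sum T_a^2 + \dfrac{\beta\lambda^2}{e^{\lambda a} - 1} & \sum T_a T_m + \dfrac{\beta\lambda^2}{e^{\lambda a} - 1} & \sum T_a T_\sigma + \dfrac{\beta\lambda^3\sigma}{e^{\lambda a} - 1}\\
\cdot & \cdot & \sum T_m^2 - \beta\lambda^2 & \sum T_m T_\sigma - \beta\lambda^3\sigma\\
\cdot & \cdot & \cdot & \sum T_\sigma^2 - \beta\lambda^4\sigma^2 + 2A\sigma^{-2}
\end{pmatrix} \tag{12}$$

where

$$\beta = (B + C)(1 - e^{-\lambda a}). \tag{13}$$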


Writing $S$ for the vector of scores calculated at trial values given by the vector $\theta_0$, we calculate
approximate maximum-likelihood values, $\theta_1$, given by

$$\theta_1 = \theta_0 + I^{-1}S, \tag{14}$$

as is well known. The procedure is then repeated using $\theta_1$ for the trial values until sufficient
accuracy is obtained.
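The scoring cycle in (14) can be illustrated in miniature. The sketch below is not the four-parameter procedure of the paper: as a deliberately restricted assumption it uses only the A-distribution, for which the score for $\sigma$ is $-A\sigma^{-1} + \tfrac12 Ay\sigma^{-3}$ and the expected information is $2A\sigma^{-2}$, and it repeats $\sigma_1 = \sigma_0 + S/I$ until the value settles at $(\tfrac12 y)^{1/2}$. The A-column counts are transcribed from Table 1; the function names are hypothetical.

```python
def score_and_information(sigma, n_families, second_moment):
    """Score and expected information for sigma from the A-distribution alone."""
    score = -n_families / sigma + n_families * second_moment / (2.0 * sigma ** 3)
    information = 2.0 * n_families / sigma ** 2
    return score, information

def scoring_iterations(sigma0, n_families, second_moment, cycles=8):
    """Repeat theta_1 = theta_0 + S / I, keeping every trial value."""
    sigma = sigma0
    history = [sigma]
    for _ in range(cycles):
        s, info = score_and_information(sigma, n_families, second_moment)
        sigma = sigma + s / info
        history.append(sigma)
    return history

# Intervals w (days) and numbers of A families, as transcribed from Table 1.
a_counts = {0: 5, 1: 13, 2: 5, 3: 4, 4: 2}
A = sum(a_counts.values())
y = sum(w * w * n for w, n in a_counts.items()) / A   # second moment about the origin

history = scoring_iterations(1.0, A, y)
print(history)
```

With several parameters the division by the information becomes multiplication by the inverse of the information matrix, but the cycle is otherwise the same.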
To obtain initial trial values it is convenient to use the approximate estimates given in
a previous paper (Bailey, 1955), namely,

$$\begin{aligned}
\lambda &= T^{1/2}U^{-1/2},\\
a &= \lambda^{-1}\log(1 + Y),\\
m &= \bar z - \lambda^{-1} + aY^{-1},\\
\sigma &= (\tfrac12 y)^{1/2},
\end{aligned} \tag{15}$$

where

$$\begin{aligned}
T &= 1 - Y^{-2}(1 + Y)\{\log(1 + Y)\}^2,\\
U &= v - \tfrac12 y,\\
Y &= B/C.
\end{aligned} \tag{16}$$
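The preliminary estimates can be checked directly against the data. The sketch below applies the moment relations to the counts of Table 1, as far as they can be read from it, with $C = 45$ taken from the overall total of 264. It assumes that $v$ is computed with divisor $B - 1$, which the text does not state, and that $\lambda^2 = T/U$ with $T = 1 - Y^{-2}(1 + Y)\{\log(1 + Y)\}^2$ and $U = v - \tfrac12 y$. The function name is hypothetical.

```python
import math

# Counts transcribed from Table 1: interval in days -> number of families.
a_counts = {0: 5, 1: 13, 2: 5, 3: 4, 4: 2}
b_counts = {4: 1, 5: 2, 6: 4, 7: 11, 8: 5, 9: 25, 10: 37, 11: 38,
            12: 26, 13: 12, 14: 15, 15: 6, 16: 3, 17: 1, 18: 3, 21: 1}
n_single = 45   # C, families with one case only (264 - 219)

def preliminary_estimates(a_counts, b_counts, n_single):
    """Moment-type starting values for lambda, a, m and sigma."""
    A = sum(a_counts.values())
    B = sum(b_counts.values())
    y = sum(w * w * n for w, n in a_counts.items()) / A
    z_bar = sum(z * n for z, n in b_counts.items()) / B
    # Assumption: sample variance with divisor B - 1.
    v = sum((z - z_bar) ** 2 * n for z, n in b_counts.items()) / (B - 1)
    Y = B / n_single
    F = math.log1p(Y)                        # lambda * a
    T = 1.0 - (1.0 + Y) * F * F / (Y * Y)
    U = v - 0.5 * y
    lam = math.sqrt(T / U)
    a = F / lam
    m = z_bar - 1.0 / lam + a / Y
    sigma = math.sqrt(0.5 * y)
    return lam, a, m, sigma

lam0, a0, m0, sigma0 = preliminary_estimates(a_counts, b_counts, n_single)
print(lam0, a0, m0, sigma0)
```

Under these assumptions the values come out close to 0.20, 8.1, 7.9 and 1.32, in agreement with the preliminary estimates quoted for these data.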


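If it should happen that in any set of data $A$ were small or zero then no satisfactory initial
estimate of $\sigma$ would be available from (15). However, it is still possible to obtain rough
estimates by setting $B$ equal to its expectation, leading to

$$\lambda a = \log(1 + Y) = F, \tag{17}$$

and putting the first three sample cumulants in the B-distribution equal to their
expectations, viz.

$$\begin{aligned}
\kappa_1 &= m + \lambda^{-1} - a(e^{\lambda a} - 1)^{-1} = \bar z,\\
\kappa_2 &= \sigma^2 + \lambda^{-2} - a^2 e^{\lambda a}(e^{\lambda a} - 1)^{-2} = v,\\
\kappa_3 &= 2\lambda^{-3} - a^3 e^{\lambda a}(e^{\lambda a} + 1)(e^{\lambda a} - 1)^{-3} = m_3.
\end{aligned} \tag{18}$$

Using (17) and (18), we can solve for the required estimates successively:

$$\begin{aligned}
a^3 &= m_3\{2F^{-3} - Y^{-3}(1 + Y)(2 + Y)\}^{-1},\\
\sigma^2 &= v - a^2\{F^{-2} - Y^{-2}(1 + Y)\},\\
m &= \bar z - a(F^{-1} - Y^{-1}),\\
\lambda &= F/a.
\end{aligned} \tag{19}$$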


4. ILLUSTRATIVE EXAMPLE
Let us now apply the foregoing maximum-likelihood scoring procedure to the data
in Table 1. Examination of the observed frequencies shows local peaks at 7 and 14 days,
with a corresponding deficit at neighbouring days. This could be due to a small unconscious
bias towards selecting an exact number of weeks. … may show a preference for … It seems
advisable to pool the frequencies for, say, days 6, 7 and 8 in one group, and 13, 14 and 15
in another, when carrying out a goodness-of-fit test.
Preliminary estimates were obtained from (15). These have already been given in an
earlier paper (Bailey, 1955) together with their large sample variances, and are

$$\begin{aligned}
\lambda &= 0.203 \pm 0.083,\\
a &= 8.13 \pm 3.28 \text{ days},\\
m &= 7.94 \pm 0.87 \text{ days},\\
\sigma &= 1.32 \pm 0.17 \text{ days}.
\end{aligned} \tag{20}$$

The estimates of both $\lambda$ and $a$ appear to have rather low precision; and we should not be
surprised if the maximum-likelihood values are appreciably different, and much more
accurate. After one cycle of successive approximation the resulting estimates did in fact
show large changes. When the third set of values was obtained the information matrix was
recalculated, and convergence was then rapid. Stability was practically achieved with
the fourth set of values, and a final fifth stage gave only very small further corrections. The
maximum-likelihood estimates found in this way are

$$\begin{aligned}
\hat\lambda &= 0.256 \pm 0.032,\\
\hat a &= 6.57 \pm 0.76 \text{ days},\\
\hat m &= 8.58 \pm 0.32 \text{ days},\\
\hat\sigma &= 1.71 \pm 0.13 \text{ days}.
\end{aligned} \tag{21}$$

There is thus a striking gain in efficiency compared with the preliminary estimates appearing
in (20), and the additional labour in computation is evidently worth while.
When carrying out the usual goodness-of-fit test it is convenient to use the last
approximation but one, if sufficiently accurate, since we need not then recalculate the $R(z)$ in
evaluating the function given by (5). In practice, of course, the data will normally be
grouped in units of 1 day, as in Table 1. The fitted values for distribution B were obtained
Table 1. Observed and expected values for Hope Simpson's data on
measles in families with two susceptibles

Time interval          Observed no. of families
between cases          ------------------------       Expected no.
in days                  A        B      Total

   0                     5        .        5             4.67
   1                    13        .       13             8.58
   2                     5        .        5             6.73
   3                     4        .        4             4.53
   4                     2        1        3             2.78
   5                     .        2        2             …
   6                     .        4        4             3.97
   7                     .       11       11             8.85
   8                     .        5        5            16.63
   9                     .       25       25            24.72
  10                     .       37       37            29.44
  11                     .       38       38            29.28
  12                     .       26       26            25.44
  13                     .       12       12            19.99
  14                     .       15       15            14.28
  15                     .        6        6             9.02
  16                     .        3        3             4.82
  17                     .        1        1             2.09
  18                     .        3        3             0.71
  19                     .        .        .             …
  20                     .        .        .             0.04
  21                     .        1        1             0.00

Sub-totals              29      190      219           219.00
One case only (C)                         45            44.11
Primary and secondary (B)                              190.89
Overall total (A + B + C)                264           264

Days 4 and 5, and days 16 to 21, are bracketed together (pooled observed totals 5 and 8;
pooled expected total for days 16 to 21, 7.84).


merely by calculating $f(z)$ at the mid-point of each interval, as the small additional accuracy
resulting from integration over the interval was thought not worth the extra labour of
computation. On the other hand, integrated values for distribution A are immediately
available from the tabulated values of $P(x)$ in (10). It can be seen from Table 1 that the
agreement between observed and expected values is on the whole quite good. For the actual
goodness-of-fit test the classes bracketed together have been pooled so as to avoid small
expectations. There are sixteen classes for the combined A- and B-distribution, giving
15 d.f.; and two classes, giving 1 d.f., for the numbers $B$ and $C$. From the total of 16 d.f. we
must remove 4 to allow for the parameters estimated. We find an overall $\chi^2$ of 20.3 on
12 d.f. As the 5% point is at 21.0, we can regard the fit as just adequate. Actually, we have
already remarked the possibility of unconscious bias in the records producing local peaks
at 7 and 14 days, and suggested that it could be minimized by amalgamating the frequencies
for 6, 7 and 8 days in one group, and 13, 14 and 15 in another. When this is done we obtain
a $\chi^2$ of 12.9 on 8 d.f., which is entirely satisfactory since the 10% point is at 13.4.


Table 2. Efficiencies (in percentage) of estimates with parts of data absent

Data available                              Efficiencies

All three sources present                   …
No double primaries (A = 0)                 79   70   …
No single cases (C = 0)                     …
B-distribution only (A = 0 = C)             …

It is also of some interest to see what would happen if certain portions of the data were
missing. There may, for example, be no families with a double primary, i.e. $A = 0$. All we
have to do is to remove the term $2A\sigma^{-2}$ from $I_{\sigma\sigma}$ in (12) before inverting the information
matrix. Alternatively, there may be no record of $C$, the number of families with a single
case only. No information about $\lambda$ and $a$ is then available from the frequency distribution
in (2). Accordingly, we must remove from $I_{\lambda\lambda}$, $I_{\lambda a}$ and $I_{aa}$ the contributions $\rho a^2$, $\rho a\lambda$ and
$\rho\lambda^2$, respectively, where $\rho = (B + C)(e^{\lambda a} - 1)^{-1}$. Again, both these items may be missing,
and we may only have data on families with a primary and a secondary case. These results
are summarized in Table 2, which shows the appropriate efficiencies derived by comparing
the variances of the estimates in each case with the 'best' values given by the squares of the
standard errors appearing in (21). It is worth noticing that reasonable estimates
of all four parameters can be obtained in the absence of double primary data, but that
knowledge of the number of families with a single case is essential for an efficient
determination of $\lambda$. Without the B-distribution, of course, little information of value would be
forthcoming.

5. EFFECT OF VARIATIONS IN $\lambda$
It has been shown elsewhere (Bailey, 1953) that, so far as measles data for Providence,
Rhode Island, were concerned, chain binomials gave a satisfactory goodness-of-fit when
analysed stage by stage only on the assumption that the chance of cross-infection, $p$, varied
between families. In the notation of this paper we have

$$p = 1 - e^{-\lambda a}. \tag{22}$$

It follows that we ought in the present context to consider the possibility of variations in $\lambda$.
We could hardly expect to make any precise estimate of parameters in the distribution
of $\lambda$ from families with only two susceptibles, but it is worth making a rough assessment of the
likely consequences. We can do this by calculating the modification, resulting from
appropriate variations in $\lambda$, of the expected frequencies in the model. Now the distribution
chosen for $p$ was

$$dF = \frac{1}{B(x, y)}\,p^{x-1}(1 - p)^{y-1}\,dp \qquad (0 < p < 1), \tag{23}$$

and the pooled estimates of $x$ and $y$, based on households of three and four, were 1.18 and
0.28 respectively. The mean and variance of $p$ were thus

$$\begin{aligned}
\bar p &= x/(x + y) = 0.81,\\
v_p &= xy/\{(x + y)^2(x + y + 1)\} = 0.063.
\end{aligned} \tag{24}$$

Using the estimates of $\lambda$ and $a$ in (21) we see that the value of $p$ given by (22) is about 0.81,
the same as the estimate in (24) taken from the Providence data. If we now suppose $p$ to
vary about this mean value the expected frequencies for the observations $B$ and $C$ will
remain unchanged. The A-distribution will also be the same since it involves only $\sigma$.
However, the B-distribution given in (5) will be modified, and may be replaced,
approximately, by

$$f(z) + \tfrac12 v_p\,\partial^2 f(z)/\partial p^2, \tag{25}$$

where we have simply expanded in powers of $\delta p = p - \bar p$ and have taken expectations,
neglecting terms of higher order in $\delta p$. The additional
quantities on the right-hand side of (25) are all easily …
It turns out that the net result is to flatten and displace slightly towards the origin the peak
of the fitted curve. The $\chi^2$ values are a little higher. The data … give a $\chi^2$ of 14.3 on 8 d.f.,
which is still satisfactory. Without this … on 12 d.f., which is just … the variations in $\lambda$ are
unlikely to be as large as … should … more homogeneous than the Providence material. …
conclude that our results would probably not be appreciably influenced by only moderate
variations in the chance of infection.
A somewhat similar analysis … undertaken in respect of variations in the length of
the infectious period, $a$, but this is more complicated in that both the A- and B-distributions
are affected, and each involves considering two independently variable infectious periods.
… with measles in any … there is little scope for introducing any substantial
variations of $a$ into the present … be seen from the fact that if the variance of
$a$ … $2v_a$, and the observ… about the origin of the A-distribution would be
, say, approximately … although the effect on the goodness-of-fit would be very
small so far as the A-distribution is concerned, there would be an appreciable flattening of
the B-curve. This is clearly a matter that requires further investigation.
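The arithmetic behind the comparison with the Providence material is easily reproduced. The sketch below computes the mean and variance of the beta-type distribution of $p$ from the pooled estimates 1.18 and 0.28, and the value of $p = 1 - e^{-\lambda a}$ implied by the maximum-likelihood estimates 0.256 and 6.57. The function name is hypothetical; the numbers are those quoted in the text.

```python
import math

def beta_mean_variance(x, y):
    """Mean and variance of a beta distribution with indices x and y."""
    mean = x / (x + y)
    variance = x * y / ((x + y) ** 2 * (x + y + 1.0))
    return mean, variance

p_mean, p_var = beta_mean_variance(1.18, 0.28)

# Chance of cross-infection implied by the measles estimates of lambda and a.
p_measles = 1.0 - math.exp(-0.256 * 6.57)

print(round(p_mean, 2), round(p_var, 3), round(p_measles, 2))
```

Both routes give a chance of cross-infection of about 0.81, which is the agreement remarked on above.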


6. ALLOWANCE FOR MISCLASSIFICATION OF CHAINS

So far we have assumed that the … chains of the mathematical model (almost trivial for
…) … correctly identified. For the class with only one case there is no
possibility of error. … two cases, however, and the A- and B-distributions
overlap, then there is a definite chance of misclassification. In the example discussed in §5,
… overlap only to a very small extent, the effect is likely to be only
a minor one confined to a few borderline observations. With more substantial overlapping
it is desirable to make proper allowance for it in the analysis. This can be done by introducing
a new parameter, $\xi = 1 - \eta$, which is the prior probability that a family is of type A.
Maximum-likelihood scoring for the five unknown parameters, $\lambda$, $m$, $a$, $\sigma$ and $\xi$, can then
… , considerably more complicated than the case
… indication only of the modified procedure will be given.
… elsewhere if data requiring it appear.
… from the number of families, $C$, with one case
… $\eta e^{-\lambda a}$, and this must now be referred to the whole
sample, $N$, and not merely to $B + C$ as before. The analogue of (2) is thus

$$\binom{N}{C}\,(\eta e^{-\lambda a})^{C}\,(1 - \eta e^{-\lambda a})^{N-C}. \tag{26}$$

… $N - C$ families which are liable to misclassification with regard to the
… If we write the frequency functions in (1) and (5) as $f_1$ and $f_2$, the
contribution to the likelihood from any family with two cases separated by time interval $y$ is

$$\frac{\xi f_1(y) + \eta(1 - e^{-\lambda a})f_2(y)}{1 - \eta e^{-\lambda a}}. \tag{27}$$

Combining the contributions from (26) and (27) …

$$L = C\log\eta - C\lambda a + \sum\log\{\xi f_1(y) + \eta(1 - e^{-\lambda a})f_2(y)\}. \tag{28}$$

… that the value of $y$ is such that one of the two frequencies $f_1(y)$ and $f_2(y)$ is negligibly
small … In doubtful cases it may be safer to take $A = B = 0$ …
we therefore put $f_2 = 0$ and $f_1 = 0$, respectively. This … expression

$$L = A\log\xi + (B + C)\log\eta - C\lambda a + B\log(1 - e^{-\lambda a}) + \sum_{A}\log f_1 + \sum_{B}\log f_2 + \sum_{D}\log\{\xi f_1 + \eta(1 - e^{-\lambda a})f_2\}, \tag{29}$$

where the three sums are taken over the families allotted to distribution A, those allotted to
distribution B, and the $D$ doubtful families respectively.
The expressions (28) or (29) can be used as a basis for … 
the scores and information functions are now much more complicated …
components that are the same as in the simpler situation …

I should like to thank Mrs Tamara Hazlewood … parts of this paper are based.

REFERENCES
ank Mrs Tamara Hazlewood 
ts of this Paper are based- 


Bailey, N. T. J. (1953). The use of chain-binomials with a variable chance of infection for the
analysis of intra-household epidemics. Biometrika, 40, 279.
Bailey, N. T. J. (1954). A statistical method of estimating the periods of incubation and infection of
an infectious disease. Nature, Lond., 174, 139.
Bailey, N. T. J. (1955). Some problems in the statistical analysis of epidemic data. J. R. Statist.
Soc. B, 17, 35.
New York W.P.A. (1942). Tables of Probability Functions, vol. 2.

[ 23] 


THE BEHAVIOUR OF AN ESTIMATOR FOR A SIMPLE BIRTH 
AND DEATH PROCESS 


By J. H. DARWIN 
Applied Mathematics Laboratory, D.S.I.R., New Zealand 


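1. INTRODUCTION

The simplest birth and death process is one in which there are constant probabilities $\lambda\,dt$
and $\mu\,dt$ of an individual of a population respectively giving birth to a new individual in
$(t, t + dt)$ or dying in $(t, t + dt)$. If there are $N_0$ individuals of the population alive at time 0,
the probability generating function (p.g.f.) of the number alive at time $\tau$ is

$$\sum_{n=0}^{\infty} p_n x^n = \left[\frac{\mu(\alpha - 1)}{\lambda\alpha - \mu} + \frac{\alpha(\lambda - \mu)^2 x}{(\lambda\alpha - \mu)\{\lambda\alpha - \mu - \lambda(\alpha - 1)x\}}\right]^{N_0} = \left[\frac{\mu(\alpha - 1) - (\mu\alpha - \lambda)x}{\lambda\alpha - \mu - \lambda(\alpha - 1)x}\right]^{N_0}, \tag{1}$$

where $\alpha$ is written for $\exp((\lambda - \mu)\tau)$. Thus the probability that the population size is
$n$ at time $\tau$ is

$$\left(\frac{\mu(\alpha - 1)}{\lambda\alpha - \mu}\right)^{N_0} \sum_{r=0}^{\min(n, N_0)} \binom{N_0}{r}\binom{N_0 + n - r - 1}{n - r} \left(\frac{\lambda - \mu\alpha}{\mu(\alpha - 1)}\right)^{r} \left(\frac{\lambda(\alpha - 1)}{\lambda\alpha - \mu}\right)^{n-r}.$$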


The question of estimating the constants $\lambda$ and $\mu$ has been considered by Anscombe (1953),
Kendall (1949, 1952) and Moran (1951, 1953). In his 1949 paper Kendall discusses the case
in which it is known that $\mu = 0$, and observations are taken at times $\tau, 2\tau, \dots, k\tau$. When
$\mu = 0$ it is possible to express the probability of a population size $n$ in a much simpler form
and find the maximum-likelihood (m.l.) estimate of $\lambda$. In the other four papers it is mainly
supposed that observation is continued until a certain number of events (births or deaths)
has occurred. Such observation may not always be practicable, but observation at regular
intervals may be feasible. The complex form of the above probability then almost prohibits
the use of m.l. estimation. The estimates of the constants then used must preferably have
only a small bias and must be reasonably accurate compared with the m.l. estimates for
continuous observation. We discuss the bias and variance of an estimate of $\alpha$ when $k$
equally spaced population counts are made. If it is also known how many events have
occurred by the end of these $k$ counts, the m.l. estimate of $\mu/\lambda$ for continuous observation
is available.
2. THE ESTIMATE OF THE RATE OF INCREASE, $\exp((\lambda - \mu)\tau)$
Suppose the population, known to be $N_0$ at time 0, is counted at times $\tau, 2\tau, \dots, k\tau$ and that
its size at time $i\tau$ is $N_i$. Then the m.l. estimate of $\alpha = \exp((\lambda - \mu)\tau)$, (a) when it is known that
$\mu = 0$ (Kendall, 1949) and (b) when it is known that $\lambda = 0$ is

$$X_k = \frac{N_1 + N_2 + \dots + N_k}{N_0 + N_1 + \dots + N_{k-1}}. \tag{2}$$


24 An estimator for a simple birth and death process 


It is suggested that $X_k$ be used to estimate $\alpha$ when neither $\lambda$ nor $\mu$ is known to be 0. The
estimate is intuitively a reasonable one, since the average value of the ratio of corresponding
terms in the numerator and denominator is $\alpha$. $X_k$ is in general biased (see §2.1) and
demonstrably inconsistent (see §2.4) as $k$ tends to infinity for a range of values of $\tau$ for given
$\lambda$, $\mu$ and $N_0$. However no simple unbiased estimate of $\alpha$ suggests itself (e.g. $(1/k)\sum_{i=1}^{k} N_i/N_{i-1}$
requires a stopping rule if, for some $j$, $N_j, N_{j+1}, \dots, N_k = 0$ and its average value, relative to this
rule, will in general be biased). Also $X_k$ has the virtues that its bias (see §2.2) and variance
(see §3) tend to zero as $N_0$ tends to infinity, and that its asymptotic variance then can be
made of comparable size to that of the m.l. estimate of $\alpha$ for continuous observation by
making observations over a greater period (see §3.1).
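The estimate $X_k$ is easily computed from a run of counts, and its behaviour can be explored by simulation. The sketch below is only an illustration under stated assumptions: `x_k` forms the ratio of the sum of $N_1, \dots, N_k$ to the sum of $N_0, \dots, N_{k-1}$, and `simulate_counts` generates counts from a simple birth and death process by drawing exponential waiting times with total rate $(\lambda + \mu)n$. The function names and the parameter values are hypothetical.

```python
import random

def x_k(counts):
    """Ratio estimate of alpha from counts [N_0, N_1, ..., N_k]."""
    return sum(counts[1:]) / sum(counts[:-1])

def simulate_counts(n0, lam, mu, tau, k, rng):
    """Population sizes at times 0, tau, ..., k*tau for a simple birth and death process."""
    counts = [n0]
    n = n0
    t = 0.0
    for i in range(1, k + 1):
        t_end = i * tau
        while n > 0:
            t_next = t + rng.expovariate((lam + mu) * n)
            if t_next > t_end:
                break
            t = t_next
            # Each event is a birth with probability lam / (lam + mu), else a death.
            n += 1 if rng.random() < lam / (lam + mu) else -1
        # Restarting the clock at t_end is valid because the waiting times are memoryless.
        t = t_end
        counts.append(n)
    return counts

rng = random.Random(1956)
counts = simulate_counts(n0=20, lam=1.0, mu=0.8, tau=0.5, k=6, rng=rng)
estimate = x_k(counts)
print(counts, estimate)
```

Once the population dies out all later counts are zero, which is the circumstance that drives the inconsistency discussed in §2.4.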


2.1. Bias of $X_k$
In finding $E(X_k)$ we may average for $X_k$ first with respect to $N_k$ for fixed $N_1, \dots, N_{k-1}$,
then average the result of this over $N_{k-1}$ for fixed $N_1, \dots, N_{k-2}$, and so on.
Thus

$$E(X_k \mid N_1, \dots, N_{k-1}) = \frac{N_1 + \dots + N_{k-1} + \alpha N_{k-1}}{N_0 + \dots + N_{k-1}}, \tag{3}$$

since $E(N_k \mid N_1, \dots, N_{k-1}) = \alpha N_{k-1}$.

When averaging over $N_{k-1}$ we observe that the right-hand side is convex from above in
$N_{k-1}$. Then Jensen's convex function inequality for $N_{k-1}$ shows that

$$E(X_k \mid N_1, \dots, N_{k-2}) \le \frac{N_1 + \dots + N_{k-2} + \alpha N_{k-2} + \alpha^2 N_{k-2}}{N_0 + \dots + N_{k-2} + \alpha N_{k-2}},$$

since $E(N_{k-1} \mid N_1, \dots, N_{k-2}) = \alpha N_{k-2}$.

The same argument repeated till $N_0$ is reached shows that

$$E(X_k) \le \alpha = \exp((\lambda - \mu)\tau). \tag{4}$$

Since $\log x$ is also a function of $x$ convex from above

$$E(\log x) \le \log E(x) \quad\text{and}\quad E(\log X_k) \le \log E(X_k) \le \log\alpha = (\lambda - \mu)\tau.$$

That is, $X_k$ is a negatively biased estimate of $\alpha = \exp((\lambda - \mu)\tau)$, and $\log X_k$ is a negatively
biased estimate of $(\lambda - \mu)\tau$. A closer consideration shows that equality in (4) is never
attained for finite $N_0$ and $k$ and positive $\tau$.


2.2. $N_0$ large
This bias becomes small when $N_0$ is large. For, suppose $x$ is a positive random variable
and $A$, $B$, $C$ and $D$ are positive constants. Then

$$\begin{aligned}
E\left[\frac{A + Bx}{C + Dx}\right] &= E\left[\frac{A + Bx}{C + D\bar x} - \frac{D(A + Bx)(x - \bar x)}{(C + D\bar x)^2} + \frac{D^2(A + Bx)(x - \bar x)^2}{(C + D\bar x)^2(C + Dx)}\right]\\
&\ge \frac{A + B\bar x}{C + D\bar x} - \frac{DB\operatorname{var}(x)}{(C + D\bar x)^2},
\end{aligned} \tag{5}$$

whence

$$E(X_k \mid N_1, \dots, N_{k-2}) \ge \frac{N_1 + \dots + N_{k-2} + N_{k-2}(\alpha + \alpha^2)}{N_0 + \dots + N_{k-2} + \alpha N_{k-2}} - \frac{(1 + \alpha)\gamma N_{k-2}}{(N_0 + \dots + N_{k-2} + \alpha N_{k-2})^2}, \tag{6}$$

where $\gamma = \operatorname{var}(N_{k-1})/N_{k-2}$ and in fact $\gamma = \{(\lambda + \mu)/(\lambda - \mu)\}\,\alpha(\alpha - 1)$. If we continue this
process on the first term of the right-hand side downward through $N_{k-3}, \dots, N_0$ we obtain
$(k - 1)$ terms of the same sign. The above one is numerically less than or equal to
$(1 + \alpha)\gamma\alpha^{-2}/N_{k-2}$, and they are all of this order in $N_0$. In fact the bias is numerically

$$\le (\gamma/N_0)\,[\,1 + (1 + \alpha)\alpha^{-2} + \dots\,], \tag{7}$$

and this tends to zero as $N_0$ tends to infinity, for fixed $\alpha$ and $k$.


2.3. Leading term for large $N_0$
We use the usual asymptotic value for $E(X_k)$ formed by the average value of the quadratic
term in the Taylor expansion of $X_k$ in powers of $(N_1 - \bar N_1), \dots, (N_k - \bar N_k)$. Suppose $\bar X_k$ is the
value of $X_k$ when each $N_i$ is set equal to $\bar N_i$. Then the bias

$$= \sum_{i,j=1}^{k} \tfrac12\,(\partial^2 \bar X_k/\partial\bar N_i\,\partial\bar N_j)\operatorname{cov}(N_i N_j).$$

The cubic term is of order $1/N_0^2$, and

$$\operatorname{cov}(N_i N_j) = N_0\,\alpha^{j}(\alpha^{i} - 1)(\lambda + \mu)/(\lambda - \mu) \quad\text{for } j \ge i.$$

The bias is, to first order in $1/N_0$,


(A--g)(x—1) rk (ie T 

NA p) (25— 1 [koc — (k — 1) a a] 
A —1y 

- n Th tal la ee eal +a} (8) 


This term disappears when $k$ becomes large for $\alpha$ greater than 1 but remains finite for $\alpha$ less
than or equal to 1. This suggests that the behaviour of $\lim_{k\to\infty} E(X_k)$ when $N_0$ is not necessarily
large may vary with the value of $\alpha$.
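The leading bias term can be evaluated numerically without a closed form. For a ratio of sums the quadratic Taylor term reduces to $\{\alpha\operatorname{var}(S_0) - \operatorname{cov}(S_1, S_0)\}/\bar S_0^{\,2}$, where $S_1 = N_1 + \dots + N_k$ and $S_0 = N_0 + \dots + N_{k-1}$. The sketch below uses that reduction (a standard ratio expansion, given here as an assumption rather than as the paper's own algebra) together with the covariance formula of this section. The function names are hypothetical, and $\lambda \ne \mu$ is required.

```python
import math

def cov_counts(i, j, n0, lam, mu, tau):
    """cov(N_i, N_j) = N_0 alpha^j (alpha^i - 1)(lam + mu)/(lam - mu) for j >= i."""
    if i > j:
        i, j = j, i
    alpha = math.exp((lam - mu) * tau)
    return n0 * alpha ** j * (alpha ** i - 1.0) * (lam + mu) / (lam - mu)

def first_order_bias(k, n0, lam, mu, tau):
    """Quadratic Taylor term of E(X_k) - alpha, of order 1/N_0."""
    alpha = math.exp((lam - mu) * tau)
    mean_den = n0 * sum(alpha ** i for i in range(k))          # mean of N_0 + ... + N_{k-1}
    var_den = sum(cov_counts(i, j, n0, lam, mu, tau)
                  for i in range(1, k) for j in range(1, k))   # N_0 is fixed
    cov_num_den = sum(cov_counts(i, j, n0, lam, mu, tau)
                      for i in range(1, k + 1) for j in range(1, k))
    return (alpha * var_den - cov_num_den) / mean_den ** 2

n0, lam, mu, tau = 50, 1.0, 0.8, 0.5
alpha = math.exp((lam - mu) * tau)
gamma = alpha * (alpha - 1.0) * (lam + mu) / (lam - mu)
biases = [first_order_bias(k, n0, lam, mu, tau) for k in range(1, 4)]
print(biases)
```

For $k = 1$ the term vanishes, since $X_1 = N_1/N_0$ is unbiased, and for $k = 2$ it reduces to $-\gamma/\{N_0(1 + \alpha)^2\}$, negative as §2.1 requires.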


2.4. $E(X_k)$ for large $k$
We have been unable to find results for large $k$ holding for the whole range of $\mu/\lambda$, but
those given below indicate that the bias is most serious when $\lambda$ is less than or equal to $\mu$.
This is to be expected as the probability that $N_k$ is 0 then tends to 1 as $k$ tends to infinity,
if $\tau$ is fixed. There is thus a high probability for large $k$ that the last ratio of corresponding
terms in the numerator and denominator in $X_k$ is 0 which is less than $\alpha$.
We may divide the process of averaging $X_k$ over the values of $N_1, \dots, N_k$ into

(a) summation over $N_k$ for any fixed $N_1, \dots, N_{k-1}$. Then
(b) summation over all the sets $(N_1, \dots, N_{k-1})$ of which no member is zero and
(c) summation over the $(k - 1)$ groups in which the first member to be zero is in turn
$N_1, \dots, N_{k-1}$.


Then $E(X_k)$ has contributions from (b) and (c) after (a) has been done.
(a) First for any fixed $N_1, N_2, \dots, N_{k-1}$,

$$\sum_{N_k} X_k \operatorname{Prob}(N_1, \dots, N_k) = \left[\frac{N_1 + \dots + N_{k-2} + N_{k-1}(1 + \alpha)}{N_0 + \dots + N_{k-1}}\right] \operatorname{Prob}(N_1, \dots, N_{k-1}). \tag{9}$$

(b) The ratio in (9), for any set of positive values of $N_1, \dots, N_{k-1}$, is less than or equal to
$1 + \alpha$. The contribution to $E(X_k)$ from (b) is then less than or equal to $(1 + \alpha)\left(1 - \sum_{i=1}^{k-1} P_i\right)$,
where $P_i$ is the probability that $N_i$ is the first zero $N$, $i = 1, 2, \dots$. $P_i$ is the total probability
that $N_i$ is zero, minus the total probability that $N_{i-1}$ is zero. Then

$$\sum_{i=1}^{k-1} P_i = \left(\frac{\mu(\alpha^{k-1} - 1)}{\lambda\alpha^{k-1} - \mu}\right)^{N_0} \tag{10}$$

and the contribution from (b) is less than or equal to

$$(1 + \alpha)\left[1 - \left(\frac{\mu(\alpha^{k-1} - 1)}{\lambda\alpha^{k-1} - \mu}\right)^{N_0}\right].$$

(c) If $N_r$, when $r$ is one of $1, \dots, k - 1$, is the first $N_i$ to be zero

$$X_k = 1 - N_0/(N_0 + \dots + N_{r-1}), \tag{11}$$

and the contribution to $E(X_k)$ from (c) is less than or equal to $\sum_{i=1}^{k-1} P_i$.
Hence from (a), (b) and (c),

$$E(X_k) \le 1 + \alpha - \alpha\left(\frac{\mu(\alpha^{k-1} - 1)}{\lambda\alpha^{k-1} - \mu}\right)^{N_0}. \tag{12}$$

Now if $\lambda$ is greater than $\mu$ this upper bound is approximately $1 + \alpha - \alpha(\mu/\lambda)^{N_0}$ for large $k$.
The inequality

$$\alpha > (\lambda/\mu)^{N_0} \tag{13}$$

determines, for given $\lambda$, $\mu$ and $N_0$, a value $\tau_0$ of $\tau$ such that $X_k$ is inconsistent with respect
to $k$ for $\tau \ge \tau_0$.


If the term $N_0/(N_0 + \ldots + N_{r-1})$ of (11) which has so far been omitted is taken into account,
the upper bound in (12) is reduced by
$$\sum_{r=1}^{k-1} \sum \bigl[N_0/(N_0 + \ldots + N_{r-1})\bigr]\,\mathrm{Prob}(N_1, \ldots, N_{r-1}, N_r = 0), \tag{14}$$
where the summation is taken over the non-zero values of $N_1, \ldots, N_{r-1}$ when $N_r$ is zero.
(14) might for some values of $\lambda$ and $\mu$ considerably lower the upper [...]
No general necessary and sufficient condition [...]
$$1 - (\lambda\tau/(1+\lambda\tau))^{N_0} \quad \text{for large } k. \tag{15}$$


Hence there is always inconsistency no matter what finite value $N_0$ has.
If $\lambda$ is less than $\mu$ a similar use of the first term of (14) shows that there is always
inconsistency if $N_0 = 1$. But for $N_0$ greater than 1 such use involves conditions on $\lambda$ and $\mu$ for


J. H. DARWIN 27 


inconsistency to be demonstrable. If, however, only the second term of (14) is included we
may prove a stronger result. This second term is


; 1—a)\™ 
EUV/AS HN) (62) ' Prob (N), 


where the summation is over non-zero values of $N_1$ and $\mathrm{Prob}(N_1)$ is that given by the
coefficient of $z^{N_1}$ in the p.g.f. (1). Then by Schwarz's inequality the term has a value


> Ny PILE +AA -aMi Aad Prob (1 


> P3/(P, +æ). (16) 
H - : MEREN 
ence EQ) «1- B- Pl ea) eai (OSE) | 
(17) 


=1-P,—Pi(P,+) 


for large $k$, and any $\alpha$ less than 1.
Suppose $\alpha$ is small. Then this lower bound
zaN(qu- AX)QONQ — A+): 
It is always possible therefore to find a $\tau$, say $\tau_1$, making $\alpha$ small enough for this lower bound
to be less than $\alpha$. Hence $X_k$ is inconsistent with respect to $k$ for $\tau$ greater than $\tau_1$.


2.41. The case $\mu = 0$
That consistency with respect to $k$ is possible is shown by the limiting behaviour of $E(X_k)$
when it is known that $\mu = 0$ and $X_k$ is used to estimate $\exp(\lambda\tau)$. Then the joint p.g.f. of
$N_1, \ldots, N_k$ can be written down, and hence that of $N_k - N_0$ and $N_0 + \ldots + N_{k-1}$. Suppose
this is $p(u, v)$, where $u$ and $v$ are carriers for $N_k - N_0$ and $N_0 + \ldots + N_{k-1}$ respectively. Then


$$E(X_k) = 1 + \int_0^1 \Bigl[\frac{\partial p}{\partial u}\Bigr]_{u=1} \frac{dv}{v}. \tag{18}$$
It follows from (1) with = 0 that 
s B (19) 
T pluu) = Ez (a — 1) u(a 7 + av n =| i 
T hen 
E(X Ses 1N, vNok Taek = v^) (a- v)%o 
i) 1- (a vfi (a (1 — v) + (a — 1) v) 


1 N, pNok-Vak — 1) (a—- 1)%e 
bitan f Marae 
a faat —v)+a-— 1) 
0 ^ 
osea o) ac — 17 de. 
J 0 Ne 


M 


Il 


1+(a—1)Nott(1—@& 


-k) — (a — 1) (1 —a-*) Nb - 1 


Il 


l+(æ&—1)(1—& 
NK —v)-d- a — 1) «dv which ean be evaluated. 
-*). Hence the right-hand side has 
qual to æ. Hence lim £(X;) 


k->% 


The; 
© Integral here is positive and less than 


and for Ny = 1 it is O(ko 


For it is O(a- 
N, greater than litis (a i ‘) (X ) is less than or e 
k a : 


g er its limit as k tends to infinity. But # 
Xists and is a. 




2.42. The case $\lambda = 0$


The actual limiting bias can be found when it is known that $\lambda = 0$, so that $X_k$ is an estimate
of $\exp(-\mu\tau)$. By the same method as was used in §2.41 we find that
$$E(X_k) \to 1 - N_0(1-\alpha)^{N_0} \int_0^1 \frac{v^{N_0-1}}{(1-\alpha v)^{N_0}}\,dv. \tag{21}$$
For example for $N_0 = 1$, when the bias is biggest,
$$E(X_k) \to 1 - ((1-\alpha)/\alpha)\log[1/(1-\alpha)]. \tag{22}$$
This takes values 0, 0.107, 0.234, 0.389, 0.598, 0.744 for $\alpha$ = 0, 0.2, 0.4, 0.6, 0.8, 0.9
respectively.
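
The quoted values can be checked numerically. The sketch below assumes that (22) reads $1 - ((1-\alpha)/\alpha)\log[1/(1-\alpha)]$ and that (21) is the integral $1 - N_0(1-\alpha)^{N_0}\int_0^1 v^{N_0-1}(1-\alpha v)^{-N_0}\,dv$; the function names and the midpoint rule are illustrative choices, not part of the paper.

```python
import math


def limiting_mean_n0_1(alpha):
    """Closed form (22): limit of E(X_k) when lambda = 0 and N_0 = 1."""
    if alpha == 0:
        return 0.0  # limiting value of the expression as alpha -> 0
    return 1.0 - ((1.0 - alpha) / alpha) * math.log(1.0 / (1.0 - alpha))


def limiting_mean(alpha, n0, steps=20000):
    """Integral form (21), evaluated by a simple midpoint rule (an assumption)."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        v = (j + 0.5) * h
        total += v ** (n0 - 1) / (1.0 - alpha * v) ** n0
    return 1.0 - n0 * (1.0 - alpha) ** n0 * total * h


for a in (0.2, 0.4, 0.6, 0.8, 0.9):
    print(a, round(limiting_mean_n0_1(a), 3), round(limiting_mean(a, 1), 3))
```

For $N_0 = 1$ the two functions should agree closely, which gives a check on the form of the integrand; `limiting_mean` can then be used to explore larger $N_0$.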
3. THE VARIANCE OF $X_k$ FOR LARGE $N_0$
§2.4 shows that there is not in general an improvement in the average value of $X_k$ as $k$
becomes large, but §2.2 shows that a large $N_0$ means that the bias is small. It is thus natural
to consider the accuracy of $X_k$ rather for large $N_0$ than for large $k$. We shall use the variance
of $X_k$ for large $N_0$ as a yardstick for measuring the efficiency of this type of regular observation
compared with continuous observation.
The limiting form of the variance of $X_k$ to order $1/N_0$ is
$$\sum_{i,j=1}^{k} \frac{\partial X_k}{\partial N_i}\frac{\partial X_k}{\partial N_j}\,\mathrm{cov}(N_i, N_j) = \frac{(\lambda+\mu)\,\alpha(\alpha-1)^2}{N_0(\lambda-\mu)(\alpha^k-1)}. \tag{23}$$
The coefficient of $1/N_0$ is small for large $k$ when $\alpha$ is greater than or equal to 1, but for $\alpha$ less
than 1, increasing $k$ gives little improvement.
We wish to compare this variance with that of the m.l. estimate of $\alpha$ when observation is
continuous. The justification for the use of the m.l. estimate is that if $N_0$ is large there is
effectively a large number of independent replicates. Suppose such observation is made
over a period $t$ and that $r$ events have happened at intervals $\tau_1, \tau_2, \ldots, \tau_r$ so that
$$\tau_1 + \tau_2 + \ldots + \tau_r \le t.$$


Then the likelihood of this happening when there were $N_0$ members at the beginning of
observation is
$$n_0\nu_0 \exp[-n_0(\lambda+\mu)\tau_1]\; n_1\nu_1 \exp[-n_1(\lambda+\mu)\tau_2] \ldots n_{r-1}\nu_{r-1} \exp[-n_{r-1}(\lambda+\mu)\tau_r] \times \exp[-n_r(\lambda+\mu)(t - (\tau_1 + \ldots + \tau_r))]. \tag{24}$$
In this $\nu_i$ is either $\lambda$ or $\mu$ depending on whether the $(i+1)$th event that has happened is
a birth or a death; $n_{i+1}$ is $n_i + 1$ if $\nu_i$ is $\lambda$ and is $n_i - 1$ if $\nu_i$ is $\mu$. Here $n_0 = N_0$. Then the log
likelihood is
$$L = -(\lambda+\mu)X + \sum_{i=0}^{r-1} \log n_i + B\log\lambda + D\log\mu, \tag{25}$$
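where $X = n_0\tau_1 + n_1\tau_2 + \ldots + n_{r-1}\tau_r + n_r(t - (\tau_1 + \ldots + \tau_r))$,
$B$ is the number of births, and $D$ the number of deaths in time $t$. The m.l. estimate of $\lambda - \mu$
is $(B-D)/X$. The variance of the m.l. estimate, $\exp((B-D)t/X)$, of $\alpha = \exp[(\lambda-\mu)t]$ is,
to the first order in $1/N_0$,
$$\alpha^2 t^2\Bigl[\frac{(\bar{B}-\bar{D})^2}{\bar{X}^4}\,\mathrm{var}\,X - \frac{2(\bar{B}-\bar{D})}{\bar{X}^3}\,\mathrm{cov}(X, B-D) + \frac{1}{\bar{X}^2}\,\mathrm{var}(B-D)\Bigr], \tag{26}$$
where $\bar{B} - \bar{D}$ and $\bar{X}$ are mean values.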




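Now $B - D = n_t - N_0$,
$$E(B-D) = N_0(\alpha - 1),$$
$$\mathrm{var}(B-D) = N_0(\lambda+\mu)\,\alpha(\alpha-1)/(\lambda-\mu). \tag{27}$$
Again, since $X = \int_0^t n_u\,du$,
$$E(X) = N_0(\alpha-1)/(\lambda-\mu),$$
$$E(X^2) = \int_0^t\!\!\int_0^t E(n_u n_v)\,du\,dv = 2\int_0^t du \int_0^u E(n_v^2)\,e^{(\lambda-\mu)(u-v)}\,dv, \text{ etc.}, \tag{28}$$
and
$$E(X(B-D)) = \int_0^t E(n_u(n_t - N_0))\,du = \int_0^t E(n_u^2)\,e^{(\lambda-\mu)(t-u)}\,du - N_0\bar{X}, \text{ etc.}$$
Then for large $N_0$
$$\mathrm{var}\Bigl[\exp\Bigl(\frac{(B-D)t}{X}\Bigr)\Bigr] = \frac{(\lambda+\mu)\,\alpha^2(\log\alpha)^2}{N_0(\lambda-\mu)(\alpha-1)}. \tag{29}$$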


We may first compare (29) with (23) when $k$ is 1 and $t$ is taken as $\tau$. Since this is a comparison
of the variance of the m.l. estimate of $\alpha$ for continuous observation over $\tau$ with the variance
of $X_1$ for one observation at $\tau$ we may define the ratio of these variances as the efficiency
of $X_1$. This ratio is
$$\Bigl[\frac{\tfrac12(\lambda-\mu)\tau}{\sinh(\tfrac12(\lambda-\mu)\tau)}\Bigr]^2. \tag{30}$$
This is the same formula for $\lambda - \mu$ as was Kendall's formula for $\lambda$ (Kendall, 1949).
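
The efficiency (30) can be cross-checked against the ratio of the two limiting variances. The sketch below assumes that the variance of $X_1$ is $(\lambda+\mu)\alpha(\alpha-1)/(N_0(\lambda-\mu))$ and that of the m.l. estimate is $(\lambda+\mu)\alpha^2(\log\alpha)^2/(N_0(\lambda-\mu)(\alpha-1))$, so that their ratio is $\alpha(\log\alpha)^2/(\alpha-1)^2$; the function names are illustrative only.

```python
import math


def efficiency(lam, mu, tau):
    """Efficiency in the form (30): [x / sinh(x)]^2 with x = (lam - mu) * tau / 2."""
    x = 0.5 * (lam - mu) * tau
    if x == 0:
        return 1.0  # limiting value when lam = mu
    return (x / math.sinh(x)) ** 2


def variance_ratio(lam, mu, tau):
    """Ratio of the assumed limiting variances, written in terms of alpha."""
    alpha = math.exp((lam - mu) * tau)
    return alpha * math.log(alpha) ** 2 / (alpha - 1.0) ** 2


print(efficiency(1.0, 0.5, 2.0), variance_ratio(1.0, 0.5, 2.0))
```

The two expressions are algebraically identical, since $\sinh(\tfrac12\log\alpha) = (\alpha-1)/(2\sqrt{\alpha})$; the code only confirms this numerically.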


3.1. Further comparison of efficiencies


Suppose a population is known to have a large size $N_0$ at time 0. Then we may compare
the accuracy of the following estimates of $\alpha = \exp[(\lambda-\mu)\tau]$:
(a) $X_k$ when the population is counted only at $\tau, 2\tau, \ldots, k\tau$, and
(b) $\exp((B-D)\tau/X)$ when the population is continuously observed over the interval $(0, t)$.
The limiting variance of $X_k$ for large $N_0$ is (23). That of $\exp((B-D)\tau/X)$ is (29) multiplied
by $(\tau/t)^2\alpha^{2-2u}$. Suppose $t/\tau = u$. This latter variance is then
$$\frac{(\lambda+\mu)\,\alpha^2(\log\alpha)^2}{N_0(\lambda-\mu)(\alpha^u-1)}.$$
Hence the accuracy of the estimates (a) and (b) for large $N_0$ is the same if
$$\alpha^u - 1 = \alpha(\alpha^k-1)(\log\alpha)^2/(\alpha-1)^2. \tag{31}$$


For given $\alpha$ and $k$ a unique solution of this for $u$ always exists. For, the left-hand side is
always monotonic in $u$ for a given $\alpha$.
If $\alpha$ is greater than 1 the left-hand side goes from zero to infinity as $u$ goes from zero to
infinity, and this range includes the value of the right-hand side.
As $\alpha$ tends to 1, $u$ tends to $k$.




If $\alpha$ is less than 1 the left-hand side goes from zero to $-1$ as $u$ goes from zero to infinity.
By comparing the derivatives of $\log\alpha$ and $(\alpha-1)/\sqrt{\alpha}$ we can show that $\alpha(\log\alpha)^2/(\alpha-1)^2$ is
always less than 1 for all positive $\alpha$ other than 1. Hence the range zero to $-1$ includes the
value of the right-hand side.

It also follows that, except when $\alpha = 1$, $k$ is always greater than $u$. That is, $k\tau$ is greater
than $t$, as indeed is intuitively obvious as (a) can in general only be as efficient as (b) if $X_k$
embodies some information not available to (b), i.e., if at least the last of the $k$ observations
is made after $t$.

When $\alpha$ approaches 1 the rate of change in population size becomes small and continuous
information becomes of little extra assistance in its estimation.

Table 1 gives the values of $u$ for $k = 1$ (when (23) is exact) and a range of values of $\alpha$. Then
$$u = \frac{\log[1 + \alpha(\log\alpha)^2/(\alpha-1)]}{\log\alpha}.$$
We thus have the following result.

We know that a population has a large size $N_0$ at time zero and wish to make an estimate
of its rate of increase when it develops according to a simple birth and death process. The
most natural estimate to take is the m.l. estimate formed, as in §3.0, from the record of
continuous observations of the population over a period $t$.


Table 1

  $\alpha$     $u$         $\alpha$      $u$
  0.1    0.386        2.0    0.971
  0.2    0.649        3.0    0.941
  0.4    0.890        4.0    0.916
  0.6    0.972        5.0    0.898
  0.8    0.955       10.0    0.838
  1.0    1.000
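
The entries of Table 1 can be recomputed from the condition for equal accuracy. The sketch below assumes that condition to be $\alpha^u - 1 = \alpha(\alpha^k - 1)(\log\alpha)^2/(\alpha-1)^2$ and solves it for $u$; the function name is illustrative. Not every extracted table entry need agree with the recomputed value to the last figure.

```python
import math


def equivalent_u(alpha, k=1):
    """Value of u = t/tau at which continuous observation over (0, t) is as
    accurate as k regular counts, under the assumed equal-variance condition."""
    if alpha == 1.0:
        return float(k)  # limiting value as alpha -> 1
    rhs = alpha * (alpha ** k - 1.0) * math.log(alpha) ** 2 / (alpha - 1.0) ** 2
    return math.log(1.0 + rhs) / math.log(alpha)


for a in (0.1, 0.2, 0.4, 0.6, 0.8, 2.0, 3.0, 4.0, 5.0, 10.0):
    print(a, round(equivalent_u(a), 3))
```

For $k$ greater than 1 the same function gives the length of continuous observation, in units of $\tau$, that matches $k$ counts.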
This section and §2.2 show that the estimate $X_1$ formed from a single population count
at time $\tau$ is approximately unbiased, and of as low a variance as the m.l. estimate
derived from continuous observation over $(0, t)$, if $\tau$ is greater than a particular value, $t/u$,
greater than $t$.


4. ESTIMATION OF $\mu/\lambda$
If Nis esi. are the only observed quantities 


vailable. Then, since the m.l. estimate of [A is, from (25), DIB, 
o- The theoretical difficulty 




by $N_0$ being made large enough. The probability of there being no events at all in time $\tau$ is
$\exp[-N_0(\lambda+\mu)\tau]$ and of there being no births $[\mu/(\lambda+\mu) + (\lambda/(\lambda+\mu))\exp(-(\lambda+\mu)\tau)]^{N_0}$,
both of which can be made small for non-zero $\lambda$ and $\mu$ by an increase in $N_0$.


I am greatly indebted to the referee for his suggestions for the improvement of this paper.


REFERENCES 


ANSCOMBE, F. J. (1953). J. R. Statist. Soc. B, 15, 1-29.
KENDALL, D. G. (1949). J. R. Statist. Soc. B, 11, 230-64.
KENDALL, D. G. (1952). Ann. Inst. Poincaré, 13, 43-108.
MORAN, P. A. P. (1951). J. R. Statist. Soc. B, 13, 141-6.
MORAN, P. A. P. (1953). J. R. Statist. Soc. B, 15, 241-5.




EXAMINATION OF A QUANTUM HYPOTHESIS BASED 
ON A SINGLE SET OF DATA 


By S. R. BROADBENT


The British Coal Utilization Research Association 


1. INTRODUCTION


In an earlier paper (Broadbent, 1955) the following situation was discussed: an experimenter
makes the observations $y_1, y_2, \ldots, y_n$ and wishes to compare his data with the quantum
hypothesis
$$y_i = \beta + 2r_i\delta + \epsilon_i \quad (i = 1, \ldots, n). \tag{1}$$
Here $\beta$ and $2\delta$ are constants ($2\delta$ is the quantum), $r_i$ is zero or an integer and $\epsilon_i$ is the error in
observation. On this hypothesis the data will be grouped about regularly spaced means and
the grouping will be apparent if the $\epsilon_i$ are small in comparison with $\delta$. The alternative
hypothesis, that the $y_i$ are not so grouped, but are distributed unimodally or rectangularly,
is called the rectangular hypothesis.

Solutions were suggested in the previous paper to the following problems:

(i) estimation of $\beta$ and of $2\delta$,

(ii) estimation of the variance of $\epsilon_i$,

(iii) testing whether to accept a quantum hypothesis which is independent of the data
used in the test.

Sometimes the quantum hypothesis to be tested is not independent of the data alleged
to support it, but the data have actually suggested the hypothesis. In effect the experimenter
says to the statistician: 'My data appears to be grouped about regularly spaced means.
I have no independent evidence of the positions of these groups, and the body of scientific
knowledge (and my prior belief) is neither strongly for nor strongly against such a quantum
hypothesis. Is the apparent grouping a coincidence, or does it indicate some physical
reality?' This question was specifically excluded in the previous paper, and is the subject
of the present investigation.

The situation may be clarified by two analogies which show that this type of problem is
common in statistics. The first is with the $\chi^2$ test of goodness of fit. Here the agreement
between data and hypothesis is measured by $\chi^2$; when the hypothesis is independent of the
data the number of degrees of freedom used in the test equals the number of classes used [...]
data are used to estimate parameters which the hypothesis does not [...]
modification given by Fisher (1924) consists
simply in reducing the number of degrees of freedom used in the test by the number [...]
should be, the modification [...] The second analogy is with the [...]
analysis of time series, which is [...] problem. In periodogram analysis [...]
agreement with the trial periods may be measured by the intensity $S^2$. If the hypothesis
(i.e. the trial period) is independent of the data, the significance of $S^2$ may be tested. But
this independence is often not the case; for example, Kendall (1946) showed that




Beveridge's 18 or 19 ‘real’ periods in the analysis of the wheat-price index, at least three- 
quarters were spurious. Kendall concludes that tests of significance in the periodogram 
remain undiscovered; Irwin (1956) and Rudra (1955) have recently discussed such problems. 

We shall for simplicity suppose $\beta = 0$ in the hypothesis (1), and we write $2d$ for a quantum
considered after examination of the data whereas $2\delta$ is the quantum in a hypothesis independent
of the data. The statistic $s^2/d^2$ (see the previous paper and (2) below) measures the
agreement between the data and a proposed quantum $2d$.

On the rectangular hypothesis, $s^2/\delta^2$ has mean $\tfrac13$, variance $4/(45n)$ and is approximately
normally distributed. A significantly small value of $s^2/\delta^2$ indicates the validity of the quantum
hypothesis. The value of $s^2/d^2$ may have to be considerably smaller than a conventional
significance point of $s^2/\delta^2$ to validate the quantum hypothesis, since use of the data to suggest
a quantum implies a low value of $s^2/d^2$, on either hypothesis. How small $s^2/d^2$ may have to
be is indicated by the experiment described below.


Fig. 1. $s^2/d^2$ as a function of $1/d$ for a single set of 20 observations uniformly distributed.
(The mean and the 5% and 1% levels are marked on the figure.)
: Mi T 
Consider the variation in $s^2/d^2$ for a fixed set of observations as $d$ changes. We restrict
attention to values of $d$ smaller than the largest observation, since the data cannot give
information about larger quanta. It is convenient to consider $1/d$ increasing in value from
the reciprocal of the largest observation. Then $s^2/d^2$ appears as a violently oscillating
continuous function. An example is given in Fig. 1, which shows the stationary values of
$s^2/d^2$ (joined by straight lines) for a set of twenty observations, nineteen of which were
randomly drawn from a uniform distribution between 0 and 1, 1 being taken as the twentieth
observation. This sampling procedure is discussed in §3. On this figure are shown also the
mean and the lower 5 and 1% significance points of $s^2/\delta^2$ for twenty observations on the
rectangular hypothesis (taken from Table 3 of Broadbent, 1955). Although the observations
obeyed the rectangular hypothesis, $s^2/d^2$ fell below the 5% point in no less than
[...] intervals for $1/d$ between 1 and 100, and below the 1% point in three [...]
At $1/d = 80.1$, $s^2/d^2$ nearly reached the 0.1% point which is 0.1273. At all these values of
$d$ the data would have been conventionally judged to be significantly [...]




The experimenter should realize that $s^2/d^2$ may be made as small as he pleases by suitable
choice of $d$; indeed, if the data are rational, $s^2/d^2$ vanishes for infinitely many values of $d$.
When he considers possible quanta, he is in effect looking for the minima of $s^2/d^2$ as a function
of $1/d$. Small values of $s^2/d^2$ at large values of $1/d$ are expected even on the rectangular
hypothesis. We proceed on the assumption that the experimenter is looking for minima of
$s^2/d^2$ at small values of $1/d$ (i.e. large values of the quantum $2d$).

To find the distribution of such minima on the rectangular hypothesis seems to be a very
difficult problem in analysis. Conventional analysis of extreme-value problems does not
appear to apply to the minima of such functions as $s^2/d^2$, in which the random element (the
observations) enters once and for all, and is thereafter treated as fixed. The distribution is
accordingly investigated by sampling methods in the third section, the calculations required
being developed in the second section. The conclusions may be of assistance in testing
proposed quanta for significance.

Two bases for this test should be emphasized. It is assumed that the experimenter is
searching for possible quanta, and will propose the quantum which seems to him most
unlikely on the rectangular hypothesis. This search for a quantum is analogous to the
estimation of a parameter in the $\chi^2$ test of goodness of fit, and the test proposed is in effect
analogous to the well-known modification to this $\chi^2$ test. We do not take into consideration
; that the experimenter originally decided to search for 
ted it (just as the usual modification t° 




Suppose $y_1, y_2, \ldots, y_n$ [...]
in increasing order of magnitude;
$$s^2/d^2 = \sum_{i=1}^{n} (y_i - 2r_i d)^2/(nd^2). \tag{2}$$
Here $r_i$ is zero or that integer which minimizes $|y_i - 2r_i d|$; this does not define $r_i$ uniquely
when $y_i = (2m+1)d$ for $m$ zero or an integer, and in this case we take $r_i = m+1$. The
vector $\{r_1, r_2, \ldots, r_n\}$ we write $\mathbf{r}(d)$; the vector is a function of the observations and of $d$.
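
The definition of $s^2/d^2$ translates directly into code. The following is a minimal sketch, assuming non-negative observations and assuming the statistic is $\sum (y_i - 2r_i d)^2/(nd^2)$ with $r_i$ the nearest integer to $y_i/2d$ and ties taken upwards as stated; the function name is illustrative.

```python
import math


def s2_over_d2(observations, d):
    """s^2/d^2 for a proposed quantum 2d (observations assumed non-negative)."""
    n = len(observations)
    total = 0.0
    for y in observations:
        # Nearest integer multiple of 2d; a tie y = (2m+1)d is resolved as r = m + 1.
        r = math.floor(y / (2.0 * d) + 0.5)
        total += (y - 2.0 * r * d) ** 2
    return total / (n * d * d)


print(s2_over_d2([1, 9, 10, 19, 21], 4.97))
```

Since each residual lies between $-d$ and $d$, the returned value always lies between 0 and 1.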




We noticed in the first section that the largest value of $d$ we need consider is $d = y_n$.
Suppose that values smaller than $d^*$ need not be considered; for example, $d^*$ might be the
order of precision with which the observations were made. We consider the behaviour of
$\mathbf{r}(d)$ and $s^2/d^2$ for a fixed set of observations as $d$ varies from $y_n$ to $d^*$.

The $i$th element of $\mathbf{r}(d)$ increases from $m$ to $(m+1)$ as $d$ attains the value $y_i/(2m+1) = d_{im}$.
We arrange the values of $d_{im}$ ($i = 1, 2, \ldots, n$; $m = 0, 1, \ldots$) in decreasing order of magnitude
from $d_{n0} = y_n$. The interval between $d_{im}$ and next smallest element of the ordered set we
call the interval $[d_{im}]$; the interval is closed at $d_{im}$ and open at the other end. We notice that
at $d = y_n$ the vector $\mathbf{r}(d)$ is $\{0, 0, \ldots, 1\}$; the vector remains constant for all values of $d$ within
any interval $[d_{im}]$, and changes in the way just described, only at the values $d_{im}$.

It is obvious that $s^2/d^2$ is a differentiable function of $d$ within any interval $[d_{im}]$. At $d_{im}$ it
may be shown that $s^2/d^2$ is continuous but not differentiable; the term $(y_i - 2r_i d)^2/nd^2$ has
slopes $\pm 2/(ny_i)$ at $d_{im} \pm \epsilon$ as $\epsilon \to 0$ (independent of $m$), and so has an upward-pointing cusp
there. It follows that $s^2/d^2$ cannot have a stationary value or minimum at any $d_{im}$, but may
have a maximum there. All minima of $s^2/d^2$ therefore occur strictly within the intervals $[d_{im}]$
and may be found by equating to zero the differential of $s^2/d^2$ with regard to $d$, in which $\mathbf{r}(d)$
is treated as constant.
The necessary condition for a minimum is
$$2d = \sum_{i=1}^{n} y_i^2 \Big/ \sum_{i=1}^{n} r_i y_i, \tag{3}$$
and here $s^2/d^2$ has the value
$$\frac{4}{n}\Bigl\{\sum_{i=1}^{n} r_i^2 - \Bigl(\sum_{i=1}^{n} r_i y_i\Bigr)^2 \Big/ \sum_{i=1}^{n} y_i^2\Bigr\}. \tag{4}$$
It follows that in each interval $[d_{im}]$ there is at most one stationary value. In the interval
$\mathbf{r}(d)$ is constant, and insertion of the corresponding values of $r_i$ in (3) gives a value of $d$ which
may or may not lie in the interval $[d_{im}]$. If $d$ does lie in the interval, the corresponding value
of $s^2/d^2$ is given by (4); if it does not, there is no stationary value in the interval.

It is now possible to lay down a procedure which gives a list of all $2d$ which are possible
quanta, down to any predetermined small value $2d^*$. The procedure is suitable
for use with mechanical methods of computing; the calculations described below were carried
out on Hollerith machines. The procedure is as follows:

(a) Compute $d_{im} = y_i/(2m+1)$ for $i = 1, 2, \ldots, n$ and $m = 0, 1, \ldots$. The calculation stops
for each $i$ at the first $d_{im}$ less than $d^*$.

(b) Arrange the $d_{im}$ computed in (a) for all $i$ and $m$ in decreasing order of magnitude, down
to the first $d_{im}$ less than $d^*$.

(c) Compute $\sum_{i=1}^{n} y_i^2$.

(d) Compute $\sum_{i=1}^{n} r_i y_i$ in each interval $[d_{im}]$. In the first interval this sum is just $y_n$; its
value in the interval $[d_{im}]$ is $y_i$ larger than its value in the preceding interval. Its values are
therefore easily found in succession.

(e) Compute $\sum_{i=1}^{n} r_i^2$ in each interval $[d_{im}]$. In the first interval the sum is 1, and its value in
the interval $[d_{im}]$ is $(2m+1)$ larger than its value in the preceding interval.




(f) Compute in each interval $[d_{im}]$, from (c), (d) and (3), the value of $d$ at which $s^2/d^2$ is
stationary, if such a value exists. If the computed value lies in the interval, it is 'accepted'.

(g) If the value of $d$ obtained in (f) is accepted, compute the corresponding $s^2/d^2$ from (c),
(d), (e) and (4).
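
Steps (a) to (g) can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the original Hollerith procedure: the names are illustrative, the break points are simply truncated at $d^*$ rather than carried to the first value below it, and the closed forms assumed for the stationary point and its value are $2d = \sum y_i^2/\sum r_i y_i$ and $(4/n)\{\sum r_i^2 - (\sum r_i y_i)^2/\sum y_i^2\}$, which follow from differentiating $s^2/d^2$ with $\mathbf{r}(d)$ held constant.

```python
def stationary_values(observations, d_star):
    """Return (d, s^2/d^2) for every accepted stationary value with d >= d_star.

    Assumes positive observations and d_star > 0.
    """
    ys = sorted(observations)
    n = len(ys)
    sum_y2 = sum(y * y for y in ys)                      # step (c)

    # Steps (a) and (b): break points d_im = y_i / (2m + 1), largest first.
    breakpoints = []
    for y in ys:
        m = 0
        while y / (2 * m + 1) >= d_star:
            breakpoints.append((y / (2 * m + 1), y, 2 * m + 1))
            m += 1
    breakpoints.sort(reverse=True)

    results = []
    sum_ry = 0.0                                         # running sum of r_i * y_i
    sum_r2 = 0                                           # running sum of r_i ** 2
    for idx, (upper, y, odd) in enumerate(breakpoints):
        sum_ry += y                                      # step (d): increases by y_i
        sum_r2 += odd                                    # step (e): increases by 2m + 1
        lower = breakpoints[idx + 1][0] if idx + 1 < len(breakpoints) else d_star
        if lower >= upper:
            continue                                     # empty interval (tied break points)
        d = sum_y2 / (2.0 * sum_ry)                      # step (f)
        if lower < d <= upper:                           # 'accepted'
            value = 4.0 / n * (sum_r2 - sum_ry ** 2 / sum_y2)   # step (g)
            results.append((d, value))
    return results


for d, value in stationary_values([1, 9, 10, 19, 21], 3.0):
    print(round(d, 2), round(value, 3))
```

Keeping running sums, as in steps (d) and (e), means that each interval costs only a constant amount of work once the break points are sorted.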

These calculations result in a list of stationary values of $s^2/d^2$, and the values of $d$ at which
they occur. Maxima and points of inflexion will occur in the list but can be neglected; the
cusps which occur at $d_{im}$ do not appear in the list. If, of two minima, that corresponding to
the smaller $d$ is at the larger (or equal) value of $s^2/d^2$, it may generally be ignored. This is
because small values of $s^2/d^2$ are expected at small values of $d$ even on the rectangular
hypothesis. The list of minima may therefore generally be reduced to a list of successive
absolute minima, in order of decreasing $d$.
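
The reduction to successive absolute minima amounts to keeping a running minimum while $d$ decreases. A minimal sketch, with an illustrative function name and taking the list of (d, $s^2/d^2$) pairs as given:

```python
def successive_absolute_minima(stationary):
    """Keep each (d, value) whose value is lower than every value at larger d."""
    kept = []
    lowest = float("inf")
    for d, value in sorted(stationary, key=lambda pair: pair[0], reverse=True):
        if value < lowest:          # an equal value at smaller d is ignored
            kept.append((d, value))
            lowest = value
    return kept


pairs = [(12.30, 0.294), (9.84, 0.367), (8.33, 0.370),
         (4.97, 0.032), (3.54, 0.292), (3.30, 0.350)]
print(successive_absolute_minima(pairs))
```

The pairs used here are the accepted entries quoted for the five-observation example; the reduction leaves two of them.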

A short example of this calculation will now be given. Suppose five values are observed,
1, 9, 10, 19 and 21, and that 3 is the value taken for $d^*$. Once the $d_{im}$ are computed and
ordered, the $y_i$ and $(2m+1)$ appropriate to each interval $[d_{im}]$ may be given as in Table 1.


Table 1. Calculation of possible estimates for a quantum

  $d_{im}$    $y_i$    $(2m+1)$    $\sum r_i y_i$    $\sum r_i^2$      $d$        $s^2/d^2$
  21       21       1          21           1        (23.43)      -
  19       19       1          40           2         12.30     0.294
  10       10       1          50           3          9.84     0.367
   9        9       1          59           4          8.33     0.370
   7       21       3          80           7         (6.15)      -
   6.33    19       3          99          10          4.97     0.032
   4.2     21       5         120          15         (4.10)      -
   3.8     19       5         139          20          3.54     0.292
   3.33    10       3         149          23          3.30     0.350


The estimate of $d$ given by (3) follows, the values not accepted (not lying within their
appropriate intervals) being given in brackets. Finally, the corresponding value of $s^2/d^2$
is given.

The list of successive absolute minima of $s^2/d^2$ has in this case only two entries, 0.294 and
0.032. Of the five entries in the last column of Table 1, four are near the expected value $\tfrac13$ on
the rectangular hypothesis. But at $d = 4.97$ the low value of 0.032 has been obtained for
$s^2/d^2$ which indicates that the observations [...]
$2d$, i.e. 0, 9.94 and 19.88. In this simple [...]




3. $s^2/d^2$ ON THE RECTANGULAR HYPOTHESIS


We return to the question asked by the experimenter in the first section, and now phrase it:
'By studying my observations I have found a possible quantum $2d$ such that $s^2/d^2$ seems to
me remarkably small for such a small $1/d$. Is its value so small as to cast doubt on the
rectangular hypothesis?'

We attempt to answer this question by comparing the experimenter's values of $s^2/d^2$ and
$1/d$ with values found in a sampling experiment in which the rectangular hypothesis held good.
Two difficulties arise in making this comparison.

The first difficulty is that points in the $(s^2/d^2, 1/d)$ plane cannot at present be completely
ordered in their departure from the expected value on the rectangular hypothesis. Suppose
the experimenter obtains the small values $s_1^2/d_1^2$ and $s_2^2/d_2^2$ at $1/d_1$ and $1/d_2$ and that $1/d_1 < 1/d_2$.
If $s_1^2/d_1^2 < s_2^2/d_2^2$, he will certainly consider $2d_1$ a more likely value for a quantum than $2d_2$ for
the reasons already given. But if $s_1^2/d_1^2 > s_2^2/d_2^2$, he must choose on rather intuitive grounds
which is the more likely quantum. For example, in Fig. 1, the successive absolute minima
of $s^2/d^2$ have been circled, and at each value below a conventional significance level the
corresponding $2d$ might be chosen as quantum. In other words, each of the successive
absolute minima (circled points in Fig. 1) corresponds to a possible candidate for consideration
as a quantum. Is the ordinate at $1/d = 2.0$ (where $s^2/d^2 = 0.22$) more startling than the
ordinate at the much larger $1/d = 80.1$ (where $s^2/d^2$ has the smaller value 0.13)? In the absence
of knowledge of the distribution of such minima, and without adopting some principle of
ordering, the question cannot be answered, and for this reason an exact test of significance
of the rectangular hypothesis does not seem possible. However, a general comparison of the
experimenter's values with those obtained on the rectangular hypothesis may still be made.

The second difficulty is that the exact distribution of successive minima on the rectangular
hypothesis is not known, and the sampling experiment was limited in two ways. It was
possible to take only a few values of $n$; this disadvantage is to some extent overcome [...]
Only a rectangular parent distribution has been used. Although the distributions of $s^2/\delta^2$ for
a unimodal and a rectangular parent distribution are very similar, it may not be true to say
the same for the distributions of successive minima of $s^2/d^2$. In the comparison described
below, the largest observation is an important statistic, and its distribution in the two
types of parent is very different. The comparison described is therefore only valid when
a rectangular parent distribution is a reasonable alternative to the quantum hypothesis,
although it may be illuminating in other cases.
The sampling experiment was carried out by the Mathematics Division of the National
Physical Laboratory. For a given value of $n$, $(n-1)$ numbers between 0 and 1 were taken
from the Rand random numbers, and the number 1 added to the set. In this way sets were
obtained with the same distribution as that given by the following procedure: take $n$
numbers distributed uniformly between 0 and $a$ ($0 < a < \infty$) and divide each by the largest
number taken. At first sight the two procedures appear different, but in each the $i$th ordered
observation $y_i$ may be shown to have the distribution
$$(n-1)\binom{n-2}{i-1}\, y_i^{\,i-1}(1-y_i)^{n-i-1}\,dy_i \quad (0 \le y_i \le 1;\ i = 1, 2, \ldots, n-1),$$
and similarly the joint distribution of the $y_i$ is the same in each case. Since $s^2/d^2$ is dimensionless
the actual value of $a$ and the fact that the largest observation in the calculations is
always 1 are irrelevant in applications. That is, the distribution of $s^2/d^2$ obtained in the
sampling experiment is directly comparable with that of $s^2/d^2$ obtained when the parent
distribution is uniform between 0 and $a$.
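
The equivalence of the two sampling procedures can be illustrated by simulation. The sketch below is only an illustration: it uses Python's pseudo-random generator in place of the Rand random numbers, the function names are hypothetical, and it compares just the mean of the smallest observation, which should be near $1/n$ under either procedure.

```python
import random


def sample_with_one_appended(n, rng):
    """(n - 1) uniform numbers on (0, 1) with the number 1 added to the set."""
    return sorted([rng.random() for _ in range(n - 1)] + [1.0])


def sample_scaled_by_maximum(n, a, rng):
    """n uniform numbers on (0, a), each divided by the largest number taken."""
    xs = [rng.uniform(0.0, a) for _ in range(n)]
    largest = max(xs)
    return sorted(x / largest for x in xs)


rng = random.Random(12345)
trials = 20000
mean_first = sum(sample_with_one_appended(5, rng)[0] for _ in range(trials)) / trials
mean_second = sum(sample_scaled_by_maximum(5, 7.0, rng)[0] for _ in range(trials)) / trials
print(mean_first, mean_second)
```

Agreement of one moment is of course not a proof; the argument in the text rests on the common distribution of the ordered observations.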

In the sampling experiment 100 sets for $n = 5$ and for $n = 20$ observations were produced,
and 50 sets for $n = 10$ and for $n = 50$. For each set, the calculation described in §2 was
completed and a table of successive absolute minima of $s^2/d^2$ was formed. The small
value $d^*$ was 0.01, i.e. possible quanta as small as 1/50 of the range of the observations were
considered. The size of the calculations is indicated by the number of $d_{im}$ formed: 140,000.

For one set of each sample size a complete table of accepted $d$ and corresponding $s^2/d^2$ was
printed. One of these tables of stationary values of $s^2/d^2$ is given in Fig. 1 as a graph of $s^2/d^2$
against $1/d$. The other three were of a similar nature. Four comments may be made:

(i) For each sample size $n$, the density of stationary values seems to remain approximately
constant as $1/d$ increases.

(ii) The density of stationary values increases with sample size $n$.

(iii) The range of the oscillations of $s^2/d^2$ decreases as $n$ increases.

(iv) Cusps (possibly maxima) at $d_{im}$ have been ignored.

[...] have about the same average number of entries (5.5);
i.e. there are nine successive minima in Fig. 1, but in the average sample only 5.5. The
scatter diagrams in Figs. 2-5 show these entries as points in the $(s^2/d^2, 1/d)$ plane. As $1/d$
increases, the values of the minima fall sharply at first and after that decrease slowly. Fig. 2
shows that for $n = 5$ very small values of $s^2/d^2$ are obtained even for small values of $1/d$. The
points circled in Fig. 2 are described below. It is therefore difficult (as might be expected)
[...] as few as five observations; startling agreement
[...] if the experimenter has been free to select the
[...] on the rectangular hypothesis.

On the rectangular hypothesis, $\sqrt{n}(\tfrac13 - s^2/\delta^2)$ is approximately normally distributed, with
mean zero and variance 4/45; it is independent of $n$. Inspection of Figs. 3-5 suggests the
following conjecture, that $\sqrt{n}(\tfrac13 - s^2/d^2)$ is approximately independent of $n$ and so depends
[...] grouped about zero or integer multiples of $2d$. Fig. 6 gives
successive maxima of $\sqrt{n}\{\tfrac13 - s^2/d^2\}$ for 50 sets of $n$ = 10, 20 and 50 observations (i.e. [...]

The test of a quantum hypothesis proposed consists in comparing the values $\sqrt{n}(\tfrac13 - s^2/d^2)$
and $y_n/d$ obtained by the experimenter [...]
experiment). If the experimenter [...]


Fig. 2. Successive minima of $s^2/d^2$, plotted against $1/d$: 100 sets for $n = 5$.




is the condition that the data must fulfil to validate the quantum hypothesis. In comparison,
the 5, 1 and 0.1% points of $\sqrt{n}(\tfrac13 - s^2/\delta^2)$ are about 0.49, 0.69 and 0.92, since its mean is zero
and variance 4/45. When the experimenter is at liberty to choose the quantum, in order to
validate his hypothesis, he must obtain agreement better than the conventional one in
a thousand level.
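
The conventional significance points quoted above follow from the normal approximation with mean zero and variance 4/45. The sketch below reproduces them with the standard library; the function names are illustrative, and the second function converts a point of $\sqrt{n}(\tfrac13 - s^2/\delta^2)$ into the corresponding lower point of $s^2/\delta^2$ for a given $n$.

```python
from math import sqrt
from statistics import NormalDist

SD = sqrt(4.0 / 45.0)   # standard deviation of sqrt(n) * (1/3 - s^2/delta^2)


def upper_point(p):
    """Upper 100p% point of sqrt(n)(1/3 - s^2/delta^2) on the normal approximation."""
    return NormalDist().inv_cdf(1.0 - p) * SD


def lower_point_of_ratio(n, p):
    """Corresponding lower 100p% point of s^2/delta^2 for n observations."""
    return 1.0 / 3.0 - upper_point(p) / sqrt(n)


for p in (0.05, 0.01, 0.001):
    print(p, round(upper_point(p), 2), round(lower_point_of_ratio(20, p), 4))
```

For $n = 20$ the same approximation puts the lower 0.1% point of $s^2/\delta^2$ near 0.127.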


Fig. 6. Successive maxima of $\sqrt{n}(\tfrac13 - s^2/d^2)$, plotted against $1/d$.
Key: o, $n = 10$; x, $n = 20$; •, $n = 50$.
It will appear that a quantum hypothesis based on a single set of data could
also be tested by a random division of the data into two sets. The first half of the
data would [...] a possible quantum, and this value would be tested with the
second half of the data. Two difficulties arise in this treatment. The first is that two
statisticians using this method might disagree in their judgement of the same data (because
it had been divided in different ways), although of course their probabilities of error of the
first and second kinds are the same. The second difficulty is that the decision, arrived at
after inspecting all the data, to test for the existence of a quantum will entail a rough guess
at the value of the quantum. This guess will generally influence the estimation of the
quantum from half the data, for the estimation generally requires knowledge of the
coefficients $r_i$ or the ability to decide which modes in the data are spurious. A quantum
deduced in this way will generally not be independent of the second half of the data. The
division by Prof. Thom of Druid Circles into English and Scottish circles is an example of
the second difficulty of this treatment of data.

In general terms, even when a quantum hypothesis is true, it is unlikely to be confirmed by a single set of data unless the s.d. of $\epsilon_i$ is small in relation to $d$ and $n$ is large. And in these conditions the hypothesis will hardly require formal justification. In other cases Fig. 6 shows that it is difficult to distinguish between a genuine quantum 'law' and a spurious quantum fitted to data. Further independent observations may then be the only way of strengthening belief in a quantum hypothesis.


4. AN EXAMPLE OF A QUANTUM HYPOTHESIS 


After surveying the data of excitation energies of nuclei, Grant (1952) proposed the quantum hypothesis that the observed energy levels were integer multiples of a particular energy for each nucleus. Accurate measurements of two or more energy levels were available for about fifty nuclei; Grant gave the probability of the random occurrence of the observed integral sequences as approximately $10^{-39}$; in other words, he believed the data were overwhelmingly in favour of the hypothesis.

The five largest numbers of observations made on individual nuclei were with $n = 7$, 8, 10, 10 and 12; the observations† ($y_i$) are given in Table 2. The quanta proposed by Grant are based on inspection of the data. Making use of the vectors $r(d)$ which follow from these quanta, estimates of $2d$ were recalculated by (3) and the corresponding $s^2/d^2$ were calculated by (4). It will now be considered whether the data of Table 2 tend to disprove the rectangular hypothesis.

We first notice that the agreement between columns $y_i$ and $2r_i d$ for each nucleus appears intuitively good, and the values of $s^2/d^2$ calculated appear to be well below $\tfrac13$. However, two of these values are certainly not significant (for the Kr and N nuclei), i.e. for these observations $\sqrt{n}\,(\tfrac13 - s^2/d^2)$ is below the 5 % level of $\sqrt{n}\,(\tfrac13 - s^2/d^2)$ on the rectangular hypothesis. The [...] be judged significant, [...] on a graph similar to Fig. 6. On this figure are drawn [...] % points for the minimum [...]


† Data privately communicated to the author.




[...] already given, the remainder of the data is not sufficiently numerous to throw [...] hypothesis: for example, the seven sets of $n = 5$ observations gave [...] in Fig. 7 on the $(s^2/d^2, 1/d)$ diagram. Although some low values of $s^2/d^2$ were obtained, it was hardly possible to get values lower than those found on the rectangular hypothesis.
[Fig. 7. The $(s^2/d^2, 1/d)$ diagram, with lines labelled 'Mean' and '5 %'.]
Table 2. Energy levels of five nuclei (Grant's data) 
| | T 
Nucleus pi Xr | HN | 4B 430 *4RaC’ 
— | ^ RT ' 
No. of 
Observations ... 7 8 10 10 12 
| | | | | | 
O | Wi 2rd | y; | 2nd | Yi | 2r;d | y 2r;d u | ord 
bserved and 0-550 | 0:542 | 5-276 | 5:338 | 2-141 3.144 | 0-875 | 0:635 | 0-608 | 0-612 
fitted energy 0-610 | 0-619 5:305 50338 | 4457 4-467 | 3-02 | 3-176 | 1-283 | 1-286 
levels (MeV.) 0.889 | 0-697 | 6-328 | 6322 | 5-033 | 5-003 | 388 | 3811 | 1-412 | 1-409 
0-768 | 0-774 | 7-164 | 7-164 | 6-752 | 6-789 | 4-54 | 4-446 | 1-663 | 1-653 
I 
0-825 | 0-851 | 7:309 | 7.305 | 6-802 | 6-789 | 5'17 | 5-081 | 1-844 1:837 
1-038 | 1-006 8-315 , 8:288 7-297 | 7:325 | 5°72 5-716 | 2-015 | 2-021 
1:315 i316 | 9156 | 931 | $565 | 8-576 | 6-33 | 6352 | 2-138 | 2:143 
Md "^ l10816 |10817 | $921 9.934 | 6-93 | 6-987 | 2:268 | 2-266 
| | 
Mod = [gee pae | 9.112 | 7-60 | 7622 | 2-439 | 2-450 
=f LI] Sees 257 | 2-513 | 2511 
D AL z nece | 2-697 | 2.695 
- Ae oth] 1. 2.880 | 2-878 
e. deer a 
2g 7867 0-635185 š 
(077: 0.14048 | 0-17867 5 0-06124 
TA M en magia | Dee | Rrh 0-0331 
Vn (4 — sign 3887 o4867 | _ 06844 0-7194 1-0401 
pn š i 1540 103-7 25-9 9441 
Stimate of o zn 0.029 | 003 0-104 0-006 
9ported g- 0-003 0-006 0-008 | 0-020 0-006 




By itself the nucleus RaC′ might be considered to validate a quantum hypothesis, but as a whole the data must be thought of as consistent with the rectangular hypothesis.
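
The statistic used in this comparison can be checked with a few lines of arithmetic. The sketch below assumes the reading $\sqrt{n}\,(\tfrac13 - s^2/d^2)$ of the statistic, the $\tfrac13$ being the value of $s^2/d^2$ expected on the rectangular hypothesis, and takes $n = 12$ and $s^2/d^2 = 0.0331$ for RaC′ as read from Table 2; the function name is illustrative only.

```python
import math


def quantum_statistic(n, s2_over_d2):
    """Return sqrt(n) * (1/3 - s^2/d^2) for n observations.

    Assumption: 1/3 is the expected value of s^2/d^2 when the residuals
    are rectangular (uniform) on (-d, d).
    """
    return math.sqrt(n) * (1.0 / 3.0 - s2_over_d2)


# RaC' as read from Table 2 (assumed values): n = 12, s^2/d^2 = 0.0331.
radium_c = quantum_statistic(12, 0.0331)
print(round(radium_c, 3))
```

With these inputs the result agrees to three decimal places with the corresponding entry of Table 2, which is the largest of the five values tabulated.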


Part of this paper was written at the Atomic Energy Research Establishment, Harwell, during a vacation consultancy, and I am indebted to the Establishment, particularly to Dr J. Howlett, for supporting the work of calculation. I am indebted to Mr C. W. Nott, of the Mathematics Division of the National Physical Laboratory, who carried out the calculations on the sampling experiment. The paper has benefited from my discussions with Prof. G. A. Barnard, and is published by permission of the British Coal Utilization Research Association.


REFERENCES 
BROADBENT, S. R. (1955). Biometrika, 42, 45.
FISHER, R. A. (1924). J. R. Statist. Soc. 87, 442.
GRANT, P. J. (1952). Proc. Phys. Soc. Lond. A, 65, 150.
IRWIN, J. O. (1956). J. R. Statist. Soc. B (in the Press).
KENDALL, M. G. (1946). The Advanced Theory of Statistics, 2, 435.
RUDRA, A. (1955). Sankhyā, 15, 9.
THOM, A. (1955). J. R. Statist. Soc. A, 118, 275.




THE NUMBER OF NEW SPECIES, AND THE INCREASE IN 
POPULATION COVERAGE, WHEN A SAMPLE IS INCREASED 


By I. J. GOOD and G. H. TOULMIN


A sample of size $N$ is drawn at random from a population of animals of various species. Methods are given for estimating, knowing only the contents of this sample, the number of species which will be represented $r$ times in a second sample of size $\lambda N$; these also enable us to estimate the number of different species and the proportion of the whole population represented in the second sample. A formula is found for the variance of the estimate; when $\lambda > 2$, this variance becomes in general very large, so that the estimate is useless without some modification. This difficulty can be partly overcome, at least for $\lambda < 5$, by using Euler's method with a suitable parameter or the methods described by Shanks (1955) to hasten the convergence of the series by which the estimate is expressed. The methods are applied to samples of words from Our Mutual Friend, to an entomological sample, and to a sample of nouns from Macaulay's essay on Bacon.


1. INTRODUCTION

We present here a further development of the theory expounded by Good (1953); that paper will be referred to, for brevity, by the letter G throughout.

We imagine a random sample of size $N$, the basic sample, to be drawn from an infinite population of animals of various species, and suppose that $n_r$ distinct species are each represented exactly $r$ times in the sample, so that

$$ \sum_{r=1}^{\infty} r\,n_r = N. \qquad (1) $$

We write

$$ d = \sum_{r=1}^{\infty} n_r, $$

the total number of distinct species in the sample. It is convenient (though, as was pointed out in G, not essential) to suppose that the total number of distinct species in the population is a known finite number $s$, so that we can calculate

$$ n_0 = s - d, \qquad (2) $$

the number of species not represented in the sample. If the actual value of $s$ is not known, our results will remain true if it is arbitrarily assumed to be any sufficiently large number. Indeed (G, p. 237), the larger $n_0$ is, the more applicable our results are. In G it was shown that certain properties of the population could be deduced approximately from the sample frequencies $n_r$: in particular, the total coverage of the sample (i.e. the proportion of the population represented in the sample, which is the sum of the population frequencies $p_\mu$ of the species represented) is approximately

$$ 1 - \frac{\mathcal{E}(n_1)}{N}, \qquad (3) $$

provided $n_1$ is large (G, formula (9)).†

† $\mathcal{E}(n_1)$ is the expected value of the random variable $n_1$ when our basic sample of $N$ specimens is taken at random. We shall use the same symbol both for this random variable and for a particular value of it.


46 The number of new species covered when a sample is increased 


We now contemplate taking a second sample, of size $\lambda N$. We describe this as the 'second sample', even though it may be (and in practice probably will be) an enlargement of the basic sample; in this case, of course, $\lambda \ge 1$. If the second sample is not an enlargement of the basic sample, it will be termed independent; this word may be interpreted in its probabilistic sense, provided that the true statistical hypothesis specifying the population frequencies is momentarily regarded as 'given'. Except in §4 our results apply to both enlargements and independent second samples.

We may now wish, for example: 

(a) To find the expected coverage of the second sample. 

(b) To find the expected number of distinct species in the second sample. 

(c) To find (roughly) the variances of estimates of population parameters which might be 
made from the second sample. 

(d) To estimate the term $\mathcal{E}_{2N}(n_{2r} \mid H)$ in formula (22) of G for the variance of $n_r$.

Results of this type may enable us to decide whether it is worth enlarging our sample, and to what extent, depending on the purposes for which it is required.

For example, consider a teacher of languages who wishes to base his teaching on the 
population frequencies of words. He will wish to estimate what size of vocabulary should 
be learnt by a student in order to decrease the need for reference to a dictionary below 
a certain frequency. It was shown in G how a sample can be used to make such an estimate. 
The present paper shows in what way the sample can also be used in order to help the 
decision of whether to carry out more sampling. For instance, in example (iii) of §6 below, the 2048 words of the basic sample had an expected coverage of 87.3 %, and we find that if the sample size were doubled, then the same expected coverage could be obtained by
the teacher means less for the student. 


application. 


Let $n_r(\lambda)$ be the random variable whose value is the number of distinct species represented exactly $r$ times in the second sample.†

We first consider a method that may appeal to statisticians who are accustomed to fit distributions by the method of moments. The method will not, however, be used in the examples if only because of the enormous amount of calculation that it requires. We begin by stating a lemma that is presumably well known, although we cannot give a reference.


Lemma. (Determination of a set of numbers whose 'factorial moments' are specified.) If

$$ b_i = \sum_{r=0}^{\infty} r^{(i)} a_r \quad (i = 0, 1, 2, \ldots), \qquad (4) $$

where $r^{(i)} = r(r-1)\cdots(r-i+1)$, then

$$ a_r = \frac{1}{r!} \sum_{i=0}^{\infty} \frac{(-1)^i}{i!}\, b_{r+i}, \qquad (5) $$

at any rate if $a_r = 0$ for all sufficiently large $r$.‡ Problems (a), (b) and (d) above reduce to the estimation of the numbers $\mathcal{E}(n_r(\lambda))$ for certain values of $r$ when values of $n_r(1) = n_r$ are observed; the results are given in equations (22), (23) and (32), respectively. The same is true of problem (c) when the variances concerned can be expressed in terms of the $n_r(\lambda)$: thus equations (30A) and (31) of G show that this is the case for Yule's 'characteristic', which is an estimate of $\sum_{\mu=1}^{s} p_\mu^2$.

† It is convenient here to depart slightly from the notation of G. The number which we write as $\mathcal{E}(n_r(\lambda))$ would there have been denoted by $\mathcal{E}_{\lambda N}(n_r)$. Note that in the case of an enlarged sample, $n_r(\lambda)$ is to be considered as varying as the whole enlarged sample is varied at random, not merely the $(\lambda - 1)N$ additional specimens, so that this correspondence of notation still holds.

‡ For a proof under more general conditions, see the Appendix, p. 62 below.

Now

$$ \frac{1}{N^{(i)}} \sum_{r=0}^{\infty} r^{(i)} n_r $$

is an unbiased estimate of $c_i = \sum_{\mu=1}^{s} p_\mu^i$ (see, for example, G, p. 245). Similarly

$$ \mathcal{E}\left\{ \frac{1}{(\lambda N)^{(i)}} \sum_{r=0}^{\infty} r^{(i)} n_r(\lambda) \right\} = \sum_{\mu=1}^{s} p_\mu^i. \qquad (6) $$

It may therefore seem reasonable to assume

$$ b_i(\lambda) = \mathcal{E}\left\{ \sum_{r=0}^{\infty} r^{(i)} n_r(\lambda) \right\} = \frac{(\lambda N)^{(i)}}{N^{(i)}} \sum_{r=0}^{\infty} r^{(i)} n_r \simeq \lambda^i \sum_{r=0}^{\infty} r^{(i)} n_r, \qquad (7) $$

and then to solve for $\mathcal{E}(n_r(\lambda))$ by using the lemma.
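
The inversion stated in the lemma can be verified numerically. The following is a minimal sketch, assuming that the 'factorial moments' are $b_i = \sum_r r^{(i)} a_r$ with $r^{(i)} = r(r-1)\cdots(r-i+1)$ and that $a_r$ vanishes beyond the end of the list supplied; the function names are illustrative and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import factorial, perm


def factorial_moments(a):
    """b_i = sum_r r^(i) a_r for i = 0 .. len(a)-1 (perm(r, i) is r^(i))."""
    return [sum(perm(r, i) * a_r for r, a_r in enumerate(a))
            for i in range(len(a))]


def invert_factorial_moments(b):
    """Recover a_r = (1/r!) * sum_i (-1)^i b_{r+i} / i! from the moments."""
    m = len(b)
    return [
        sum(Fraction((-1) ** i * b[r + i], factorial(i)) for i in range(m - r))
        / factorial(r)
        for r in range(m)
    ]


a = [3, 1, 4, 1, 5]                 # arbitrary numbers a_0 .. a_4
b = factorial_moments(a)            # their factorial moments b_0 .. b_4
recovered = invert_factorial_moments(b)
print(b, recovered)
```

Because $a_r = 0$ for $r > 4$ in this example, the sums terminate and the original numbers are recovered exactly.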

In spite of the theoretical interest of this method it seems likely that it is not really adequate. For it depends too much on the estimates of the higher population moments, and these estimates are subject to large sampling errors. We have therefore not investigated any numerical examples. Instead of proceeding via the factorial moments, we find directly the following relation between $\mathcal{E}(n_r(\lambda))$ and the $\mathcal{E}(n_r)$:

$$ \mathcal{E}(n_r(\lambda)) \simeq \sum_{i=0}^{\infty} (-1)^i \binom{r+i}{r} \lambda^r (\lambda - 1)^i\, \mathcal{E}(n_{r+i}) \qquad (8) $$

for any integer $r \ge 0$ (§2, equation (16)). If we assume that

$$ \mathcal{E}(n_r) \simeq n_r \qquad (9) $$

or

$$ \mathcal{E}(n_r) \simeq n'_r, \qquad (10) $$

where the numbers $n'_1, n'_2, n'_3, \ldots$ are obtained by smoothing the numbers $n_1, n_2, n_3, \ldots$ (see G, §§3, 7, 8), we can estimate the values of $\mathcal{E}(n_r(\lambda))$.

We point out here that the series (8) is not really infinite: for

$$ \mathcal{E}(n_{r+i}) = 0 \quad \text{whenever } r + i > N, \qquad (11) $$

and the upper limit of summation could therefore be replaced by $N - r$. If $\lambda > 2$, a practical difficulty arises: the factor $(\lambda - 1)^i$ increases rapidly with $i$, and so attaches great weight to terms for which $\mathcal{E}(n_{r+i})$ is small and therefore is liable to a large percentage error when estimated from the basic sample. It seems to be practicable to overcome this difficulty, at least for moderate values of $\lambda$ (say $\lambda < 5$), by using a method of summation that makes the series (8) converge rapidly; it is shown in §5 that Euler's method with a suitable parameter $q$ is convenient. We have not, however, been able to justify this procedure by finding a useful error term for the partial sums of the new series obtained.

We mention here two possibilities which we have not investigated practically. First, it may be possible to reach larger values of $\lambda$ in two (or more) stages: e.g. to estimate $\mathcal{E}(n_r(4))$, we might, instead of using (8) directly with $\lambda = 4$, first estimate $\mathcal{E}(n_1(2))$, $\mathcal{E}(n_2(2))$, $\mathcal{E}(n_3(2))$, ..., then smooth the values obtained, and again apply (8) with $\lambda = 2$ to estimate $\mathcal{E}(n_r(4))$. Secondly, (8) with $1/\lambda$ in place of $\lambda$ and $n_r$, $n_r(\lambda)$ interchanged might be used as a check on the results obtained, and this might provide a new method of smoothing.


We are much indebted to Dr J. Wishart for several suggestions and corrections. 


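2. ESTIMATION OF $\mathcal{E}(n_r(\lambda))$

Let $p_\mu$ ($\mu = 1, 2, \ldots, s$) be the population frequencies of the $s$ species. As in G, equation (10),

$$ \mathcal{E}(n_r) = \sum_{\mu=1}^{s} \binom{N}{r} p_\mu^r (1 - p_\mu)^{N-r}. \qquad (12) $$

(In G, the left-hand side is written as $\mathcal{E}_N(n_r \mid H)$. As explained above, we use the symbol $n_r$ only with reference to the basic sample, of size $N$; we omit the $H$, which refers to the hypothesis that the population frequencies are $\{p_\mu\}$, because we shall not be concerned with expectations on any other hypothesis.) For the second sample, we have similarly, assuming $p_\mu < \tfrac12$ for all $\mu$,

$$ \mathcal{E}(n_r(\lambda)) = \sum_{\mu=1}^{s} \binom{\lambda N}{r} p_\mu^r (1 - p_\mu)^{\lambda N - r} $$

$$ = \sum_{\mu=1}^{s} \binom{\lambda N}{r} p_\mu^r (1 - p_\mu)^{N-r} \left(1 + \frac{p_\mu}{1 - p_\mu}\right)^{-(\lambda - 1)N} $$

$$ = \sum_{\mu=1}^{s} \binom{\lambda N}{r} p_\mu^r (1 - p_\mu)^{N-r} \sum_{i=0}^{\infty} \binom{-(\lambda - 1)N}{i} \left(\frac{p_\mu}{1 - p_\mu}\right)^i \qquad (13) $$

$$ = \sum_{i=0}^{\infty} \binom{\lambda N}{r} \binom{-(\lambda - 1)N}{i} \sum_{\mu=1}^{s} p_\mu^{r+i} (1 - p_\mu)^{N-r-i} $$

$$ = \sum_{i=0}^{\infty} \frac{\binom{\lambda N}{r} \binom{-(\lambda - 1)N}{i}}{\binom{N}{r+i}}\, \mathcal{E}(n_{r+i}). \qquad (14) $$

(14) is not rigorously correct, since for $r + i > N$, $\binom{N}{r+i}$ and $\mathcal{E}(n_{r+i})$ both vanish, and the corresponding terms of the series are indeterminate. We notice, however, that if the infinite upper limit for $i$ is replaced by an odd [even] integer, the left-hand side of (13) is greater [less] than the right-hand side†, and the same therefore holds of (14). Thus the partial sums of (14) are alternately greater and less than the left-hand side, and in all practical examples a sufficiently good approximation is reached while $(r + i)$ is still small compared to $N$. Provided that we use only terms of the series for which $r + i \ll N$ and $i \ll (\lambda - 1)N$, we can write

$$ \frac{\binom{\lambda N}{r} \binom{-(\lambda - 1)N}{i}}{\binom{N}{r+i}} \simeq \frac{(\lambda N)^r}{r!} \cdot \frac{(-(\lambda - 1)N)^i}{i!} \cdot \frac{(r+i)!}{N^{r+i}} = (-1)^i \binom{r+i}{r} \lambda^r (\lambda - 1)^i. \qquad (15) $$

Hence

$$ \mathcal{E}(n_r(\lambda)) \simeq \sum_{i=0}^{\infty} (-1)^i \binom{r+i}{r} \lambda^r (\lambda - 1)^i\, \mathcal{E}(n_{r+i}), \qquad (16) $$

the partial sums erring alternately in excess and defect.

† This follows from the $n$th Mean Value Theorem applied to $(1 + x)^{-(\lambda - 1)N}$.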


(16) can also be obtained directly by using the Poisson approximation

$$ \mathcal{E}(n_r) \simeq \sum_{\mu=1}^{s} e^{-N p_\mu} \frac{(N p_\mu)^r}{r!}. \qquad (17) $$

We define an estimate of $\mathcal{E}(n_r(\lambda))$ by

$$ \hat n_r(\lambda) = \lambda^r \sum_{i=0}^{\infty} (-1)^i \binom{r+i}{r} (\lambda - 1)^i\, n_{r+i}; \qquad (18) $$

then (16) gives us $\mathcal{E}(\hat n_r(\lambda)) \simeq \mathcal{E}(n_r(\lambda))$.

For the case $r = 0$, we may seem to need to assume the value of $s$, but this assumption is not really required since we can write

$$ \hat d(\lambda) = d - \sum_{i=1}^{\infty} (-1)^i (\lambda - 1)^i n_i = s - \hat n_0(\lambda), \qquad (19) $$

so that $\mathcal{E}(\hat d(\lambda)) \simeq \mathcal{E}(d(\lambda))$.

We have thus obtained (approximately) unbiased estimates of $\mathcal{E}(n_r(\lambda))$ and $\mathcal{E}(d(\lambda))$ in terms of the observed numbers $n_r$. We shall almost certainly obtain more accurate estimates if we replace the $n_r$ by smoothed values $n'_r$; for methods of smoothing see G, §§3, 7, 8.
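
The estimates (18) and (19) are finite sums once the observed frequencies are given, so they are easily computed. The sketch below is an illustration only: it assumes the frequencies are supplied as a dictionary mapping $r \ge 1$ to $n_r$ (absent keys count as zero), and the function names and the small set of counts are hypothetical.

```python
from math import comb


def n_hat(r, lam, counts):
    """Estimate (18) for r >= 1:
    lam^r * sum_i (-1)^i C(r+i, r) (lam-1)^i n_{r+i}."""
    top = max(counts)
    return lam ** r * sum(
        (-1) ** i * comb(r + i, r) * (lam - 1) ** i * counts.get(r + i, 0)
        for i in range(top - r + 1)
    )


def d_hat(lam, counts):
    """Estimate (19): d - sum_{i>=1} (-1)^i (lam-1)^i n_i."""
    d = sum(counts.values())          # distinct species in the basic sample
    return d - sum(
        (-1) ** i * (lam - 1) ** i * counts.get(i, 0)
        for i in range(1, max(counts) + 1)
    )


counts = {1: 404, 2: 64, 3: 25}       # hypothetical n_1, n_2, n_3
print(n_hat(1, 2, counts), d_hat(2, counts))
```

For $\lambda = 1$ both functions return the observed values unchanged, as they should; for $\lambda > 2$ the alternating terms grow and the raw sums become unreliable, which is the difficulty taken up in §5.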
In the case when the second sample is an enlargement of the basic one, and it is desired to predict the value of $n_r(\lambda)$ given the basic sample rather than $\mathcal{E}(n_r(\lambda))$ [...] to be the expectation when the whole of the second sample is varied at random, it seems intuitively clear that $n_r$ should not be replaced by $n'_r$ in (18), at any rate [...], though the later terms probably should be smoothed; for instance, consider [...]. We have not attempted a rigorous treatment of this question. [...] is not as great when using formulae like (18) as when using such formulae as $r^* \simeq (r + 1)\,n_{r+1}/n_r$, involving a ratio of the $n_r$. Sometimes a ratio is involved surreptitiously, as in G (6), (6').

n important point in the argument leadin, 
E 
( 1 + fe} 5 


e as functions of the &(n,). This device can also be 
lines 8 and 9 of p. 241) of replacing the expected 
lue for a sample of size N. The 


" E ^t expansion led to terms heec 
e avoid th ¿imation made in 
e approximatior i i 1 
EE netm for a sample of size N +m by its expected v 


Tes H 
“Sult obtained is (using the notation of G) 

qam 1$ (rem (7 glam (20) 
OG? | H) = (N =r) Eim) i20 (N-r-mp i 


(rm 1)m ein TN 
( yo? E (nim) —^7N-r-m (Tema oe 
r+m : 
= (Nro d) 
i yon) by N™ and 
;valent to replacing (N —7) 
*Pproximations made in G are thus equiva heat 
rst; are reasonable prov 
“cting the terms of the sum after the first; they ae 
4 


The 
Neg] 


As noted in §1, we may be particularly interested in the coverage of the second sample, and the number of different species it will contain. By (3) and (16) with $r = 1$, the expected coverage is approximately

$$ 1 - \frac{\mathcal{E}(n_1(\lambda))}{\lambda N} \simeq 1 - \frac{1}{N} \sum_{i=0}^{\infty} (-1)^i (i + 1)(\lambda - 1)^i\, \mathcal{E}(n_{i+1}) $$

$$ \simeq 1 - \frac{1}{N} \left[ n_1 - 2(\lambda - 1) n_2 + 3(\lambda - 1)^2 n_3 - \ldots \right] \qquad (22) $$

(or, more accurately, the same formula with $n'_r$ in place of $n_r$).† The expected number of distinct species represented is (by (19)) approximately

$$ d + (\lambda - 1) n_1 - (\lambda - 1)^2 n_2 + \ldots; \qquad (23) $$

i.e. in the case of an enlarged sample, the number of new species expected is approximately

$$ (\lambda - 1) n_1 - (\lambda - 1)^2 n_2 + \ldots. \qquad (24) $$

Evidently $n_1, n_2, \ldots$ may be replaced by smoothed values in (23) and (24), but $d$ should be replaced by the smoothed value

$$ d' = n'_1 + n'_2 + \ldots $$

only in the case of an independent second sample. Note that (23) and (24) can be proved directly without assuming $s$ to be finite.
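
Formulae (22) and (24) translate directly into code. The following sketch assumes, as before, a dictionary of frequencies keyed by $r \ge 1$; the names `expected_coverage` and `expected_new_species` and the sample figures are illustrative, not taken from the examples of §6.

```python
def expected_coverage(lam, counts, N):
    """Formula (22): 1 - (1/N) [n_1 - 2(lam-1) n_2 + 3(lam-1)^2 n_3 - ...]."""
    series = sum(
        (-1) ** i * (i + 1) * (lam - 1) ** i * counts.get(i + 1, 0)
        for i in range(max(counts))
    )
    return 1 - series / N


def expected_new_species(lam, counts):
    """Formula (24): (lam-1) n_1 - (lam-1)^2 n_2 + ..."""
    return sum(
        (-1) ** (i + 1) * (lam - 1) ** i * counts.get(i, 0)
        for i in range(1, max(counts) + 1)
    )


counts = {1: 404, 2: 64, 3: 25}       # hypothetical n_1, n_2, n_3
N = 1000                              # hypothetical basic sample size
print(expected_coverage(1, counts, N), expected_coverage(2, counts, N))
print(expected_new_species(2, counts))
```

Setting $\lambda = 1$ in `expected_coverage` reproduces the coverage (3) of the basic sample, and `expected_new_species` vanishes at $\lambda = 1$, which gives two quick consistency checks on any implementation.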


3. VARIANCE OF THE ESTIMATES $\hat n_r(\lambda)$

In this section we find an expression for the variance of the estimate $\hat n_r(\lambda)$ of $\mathcal{E}(n_r(\lambda))$ defined by (18). This must not be confused with the variance of $n_r(\lambda)$, which can be found from the formulae given in G, §5. $\hat n_r(\lambda)$ is a linear function of the random variables $n_{r+i}$, and varies accordingly when we take different basic samples; we can find its variance if we know the variances and covariances of the $n_r$. We therefore start by calculating these. By the method of G, §5, we find

$$ \mathcal{E}(n_r n_s) = \delta_{rs}\, \mathcal{E}(n_r) + \frac{N!}{r!\, s!\, (N - r - s)!} \sum_{\mu \ne \nu} p_\mu^r p_\nu^s (1 - p_\mu - p_\nu)^{N-r-s} $$

$$ = \delta_{rs}\, \mathcal{E}(n_r) + \frac{N!}{r!\, s!\, (N - r - s)!} \left\{ \sum_{\mu, \nu} p_\mu^r p_\nu^s (1 - p_\mu - p_\nu)^{N-r-s} - \sum_{\mu} p_\mu^{r+s} (1 - 2 p_\mu)^{N-r-s} \right\}, \qquad (25) $$

where $\delta_{rr} = 1$, $\delta_{rs} = 0$ if $r \ne s$.
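Now

$$ (1 - p_\mu - p_\nu)^{N-r-s} = \left[ (1 - p_\mu)(1 - p_\nu) \left( 1 - \frac{p_\mu p_\nu}{(1 - p_\mu)(1 - p_\nu)} \right) \right]^{N-r-s} $$

$$ = \sum_{k} (-1)^k \binom{N-r-s}{k} p_\mu^k p_\nu^k\, (1 - p_\mu)^{N-r-k} \left( 1 + \frac{p_\mu}{1 - p_\mu} \right)^{s} (1 - p_\nu)^{N-s-k} \left( 1 + \frac{p_\nu}{1 - p_\nu} \right)^{r} $$

$$ = \sum_{i,j,k} (-1)^k \binom{N-r-s}{k} \binom{s}{i} \binom{r}{j}\, p_\mu^{i+k} (1 - p_\mu)^{N-r-i-k}\, p_\nu^{j+k} (1 - p_\nu)^{N-s-j-k} \qquad (26) $$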

† It is correct to replace $n_r$ by $n'_r$, even when the second sample is an enlargement of the basic one, because the more accurate formula for the coverage, G (9'), uses $n'_1$ in place of $n_1$, so we are interested in the [...] value of $n_1(\lambda)$ given the basic sample.


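and

$$ (1 - 2 p_\mu)^{N-r-s} = \left[ (1 - p_\mu) \left( 1 - \frac{p_\mu}{1 - p_\mu} \right) \right]^{N-r-s} = \sum_{i} (-1)^i \binom{N-r-s}{i} p_\mu^i (1 - p_\mu)^{N-r-s-i}. \qquad (27) $$

Substituting from (26) and (27) in (25),

$$ \mathcal{E}(n_r n_s) = \delta_{rs}\, \mathcal{E}(n_r) + \frac{N!}{r!\, s!\, (N-r-s)!} \left\{ \sum_{i,j,k} (-1)^k \binom{N-r-s}{k} \binom{s}{i} \binom{r}{j} \frac{\mathcal{E}(n_{r+i+k})\, \mathcal{E}(n_{s+j+k})}{\binom{N}{r+i+k} \binom{N}{s+j+k}} - \sum_{i} (-1)^i \binom{N-r-s}{i} \frac{\mathcal{E}(n_{r+s+i})}{\binom{N}{r+s+i}} \right\} $$

$$ = \delta_{rs}\, \mathcal{E}(n_r) + \sum_{i,j,k} (-1)^k \frac{(N-r-i-k)!\, (N-s-j-k)!\, (r+i+k)!\, (s+j+k)!}{(N-r-s-k)!\, N!\, i!\, j!\, k!\, (s-i)!\, (r-j)!}\, \mathcal{E}(n_{r+i+k})\, \mathcal{E}(n_{s+j+k}) - \sum_{i} (-1)^i \frac{(r+s+i)!}{r!\, s!\, i!}\, \mathcal{E}(n_{r+s+i}) \qquad (28) $$

$$ \simeq \delta_{rs}\, \mathcal{E}(n_r) + \sum_{i,j,k} (-1)^k \frac{(N-r-i-k)!\, (N-s-j-k)!\, (r+i+k)!\, (s+j+k)!}{(N-r-s-k)!\, N!\, i!\, j!\, k!\, (s-i)!\, (r-j)!}\, \mathcal{E}(n_{r+i+k})\, \mathcal{E}(n_{s+j+k}) - 2^{-(r+s)} \binom{r+s}{r}\, \mathcal{E}(n_{r+s}(2)); \qquad (29) $$

provided $i, j, k, r, s$ are all $\ll N$, the coefficient in the first sum is $O((rs/N)^{i+j+k})$; and when $i = j = k = 0$, a more precise formula shows that it is $1 + O(rs/N)$. Hence, if $rs \ll N$,

$$ \operatorname{cov}(n_r, n_s) = \mathcal{E}(n_r n_s) - \mathcal{E}(n_r)\, \mathcal{E}(n_s) \simeq \delta_{rs}\, \mathcal{E}(n_r) - 2^{-(r+s)} \binom{r+s}{r}\, \mathcal{E}(n_{r+s}(2)). \qquad (30) $$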


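Notice that when $r = s$ we have (cf. G, equation (22))

$$ V(n_r) \simeq \mathcal{E}(n_r) - 2^{-2r} \binom{2r}{r}\, \mathcal{E}(n_{2r}(2)), \qquad (31) $$

or, expanding the second term by (8),

$$ V(n_r) \simeq \mathcal{E}(n_r) - \sum_{i=0}^{\infty} (-1)^i \frac{(2r + i)!}{r!\, r!\, i!}\, \mathcal{E}(n_{2r+i}). \qquad (32) $$

For the case $r = 0$, since $s$ is constant, we have

$$ V(d) = V(n_0) \simeq \mathcal{E}(d(2)) - \mathcal{E}(d) \simeq \mathcal{E}(n_1) - \mathcal{E}(n_2) + \ldots. \qquad (33) $$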
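
As an illustration of how (32) and (33) might be used in practice, the sketch below substitutes the observed (or smoothed) frequencies for their expectations. This plug-in step is an assumption of the sketch, in the spirit of (9) and (10), and the function names and counts are hypothetical; since the series alternate, a short table of frequencies gives only a rough value.

```python
from math import comb


def var_n_r(r, counts):
    """Plug-in form of (32) for r >= 1:
    n_r - sum_i (-1)^i (2r+i)! / (r! r! i!) * n_{2r+i}.
    Uses (2r+i)!/(r! r! i!) = C(2r, r) * C(2r+i, i)."""
    top = max(counts)
    tail = sum(
        (-1) ** i * comb(2 * r, r) * comb(2 * r + i, i) * counts.get(2 * r + i, 0)
        for i in range(top - 2 * r + 1)
    )
    return counts.get(r, 0) - tail


def var_d(counts):
    """Plug-in form of (33): n_1 - n_2 + n_3 - ..."""
    return sum(
        (-1) ** (i + 1) * counts.get(i, 0)
        for i in range(1, max(counts) + 1)
    )


counts = {1: 404, 2: 64, 3: 25}       # hypothetical n_1, n_2, n_3
print(var_d(counts), var_n_r(1, counts))
```

The value returned by `var_d` coincides with the number of new species predicted by (24) for $\lambda = 2$, as (33) requires.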
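Using (30), we can now find the variance of $\hat n_r(\lambda)$. From the elementary formula for the variance of a linear form,

$$ V\Big( \sum_i a_i x_i \Big) = \sum_{i,j} a_i a_j \operatorname{cov}(x_i, x_j), $$

we have

$$ V(\hat n_r(\lambda)) = \lambda^{2r} \sum_{i,j} (-1)^{i+j} (\lambda - 1)^{i+j} \binom{r+i}{r} \binom{r+j}{r} \operatorname{cov}(n_{r+i}, n_{r+j}) $$

$$ \simeq \lambda^{2r} \left[ \sum_{i=0}^{\infty} (\lambda - 1)^{2i} \binom{r+i}{r}^2 \mathcal{E}(n_{r+i}) - \sum_{i,j} (-1)^{i+j} (\lambda - 1)^{i+j}\, 2^{-(2r+i+j)} \frac{(2r+i+j)!}{r!\, r!\, i!\, j!}\, \mathcal{E}(n_{2r+i+j}(2)) \right] $$

$$ = \lambda^{2r} \left[ \sum_{i=0}^{\infty} (\lambda - 1)^{2i} \binom{r+i}{r}^2 \mathcal{E}(n_{r+i}) - 2^{-2r} \binom{2r}{r} \sum_{k=0}^{\infty} (-1)^k (\lambda - 1)^k \binom{2r+k}{2r}\, \mathcal{E}(n_{2r+k}(2)) \right] $$

$$ \simeq \lambda^{2r} \sum_{i=0}^{\infty} (\lambda - 1)^{2i} \binom{r+i}{r}^2 \mathcal{E}(n_{r+i}) - 2^{-2r} \binom{2r}{r}\, \mathcal{E}(n_{2r}(2\lambda)), \qquad (34) $$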


using (8) again. This derivation of (34) is slightly unsatisfactory, since the second series in the previous formula may give a good approximation to the term by which we replace it only after so many terms have been taken that the approximation made in (30) is no longer valid. It may be possible to postpone making this approximation until a later stage of the calculation, but the algebra would become very heavy. In order to estimate $\mathcal{E}(n_{2r}(2\lambda))$ when calculating (34) for an actual case, it may be possible to use the method of §2 (probably in conjunction with the summation technique described in §5); or it may be easier to make a sufficiently accurate guess.

If $\hat n_r(\lambda)$ is defined by (18) with $n'_{r+i}$ in place of $n_{r+i}$ it becomes very difficult to make any estimate of its variance. We can say, however, that so long as we feel that it is worth using smoothed values at all, the variance of the estimate based on them is likely to be considerably less than that given by (34).


4. VARIANCE OF $\hat n_r(\lambda)$ CONSIDERED AS A PREDICTION OF $n_r(\lambda)$

In the last section, we were considering the question: How much may $\hat n_r(\lambda)$ be expected to differ from its mean value (which is equal to $\mathcal{E}(n_r(\lambda))$)? A question which may be more relevant is: How much may $\hat n_r(\lambda)$ be expected to differ from the value of $n_r(\lambda)$ obtained in a random second sample? To answer this question, we want to find

$$ V(\hat n_r(\lambda) - n_r(\lambda)), \qquad (35) $$

which may be called the variance of $\hat n_r(\lambda)$ considered as a prediction of the random variable $n_r(\lambda)$, rather than as an estimate of the parameter $\mathcal{E}(n_r(\lambda))$. It is evidently now necessary to consider separately the cases when the second sample is independent, and when it is an enlargement of the basic sample.

When the second sample is independent, $\hat n_r(\lambda)$ and $n_r(\lambda)$ are independent random variables and

$$ V(\hat n_r(\lambda) - n_r(\lambda)) = V(\hat n_r(\lambda)) + V(n_r(\lambda)), \qquad (36) $$

which can be calculated by using (34) and the following modification of (31):

$$ V(n_r(\lambda)) \simeq \mathcal{E}(n_r(\lambda)) - 2^{-2r} \binom{2r}{r}\, \mathcal{E}(n_{2r}(2\lambda)). \qquad (37) $$

In the case when the second sample is an enlargement, $\hat n_r(\lambda)$ and $n_r(\lambda)$ are correlated, and we have not been able to calculate (35) in this case. It may be expected to be considerably smaller than in the case of independence, at least if $\lambda$ is not large, since when $\lambda = 1$ we have $\hat n_r(1) = n_r = n_r(1)$ and so (35) is reduced to zero.


5. SUMMATION OF THE SERIES OBTAINED IN §§2 AND 3

We consider the case of the general series (18); similar remarks apply to (22), (23), (24) and (32), and to the series arising in calculating (34) and (37). It was pointed out in §1 that the term $(\lambda - 1)^i$ in (8) and (18) may cause trouble if $\lambda > 2$. In fact, the series is likely to become 'practically divergent'; i.e. to behave like an infinitely oscillating series up to a point at which we become too uncertain of the value of $\mathcal{E}(n_r)$ to continue with the calculation. This difficulty is illustrated by formula (34) for the variance of $\hat n_r(\lambda)$; if $\lambda > 2$, the series

$$ \sum_{i=0}^{\infty} (\lambda - 1)^{2i} \binom{r+i}{r}^2 \mathcal{E}(n_{r+i}) $$

is likely to have a very large sum, unless $\mathcal{E}(n_{r+i})$ decreases extremely fast. It is natural to try to overcome this difficulty by using a method of summation which is known to make some oscillating series converge: a convenient method appears to be that of Euler, with a parameter $q$, generally called the $(E, q)$ method (Hardy, 1949, pp. 178ff.). This is to

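transform the series $\sum_i a_i$ into $\sum_i a_i^{(q)}$, where

$$ a_i^{(q)} = \frac{1}{(q + 1)^{i+1}} \sum_{j=0}^{i} \binom{i}{j} q^{i-j} a_j \qquad (38) $$

$$ = (-1)^i \left( \frac{q}{q + 1} \right)^{i+1} \Delta^i c_0, \qquad c_j = \frac{(-1)^j a_j}{q^{j+1}}, \qquad (39) $$

the forward difference symbol $\Delta^i$ being defined inductively by

$$ \Delta^1 a_j = a_{j+1} - a_j, \qquad \Delta^i = \Delta^{i-1} \Delta^1. \qquad (40) $$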

(The form (39) is given by Bromwich (1926), pp. 62-6, for the case $q = 1$. It leads to a convenient method of setting out the work in a practical example, which will be illustrated in Example (i) of the next section.) If $\sum_i a_i$ converges, then $\sum_i a_i^{(q)}$ converges to the same sum, for any $q > 0$; if $\sum_i a_i^{(q')}$ converges, then $\sum_i a_i^{(q)}$ converges to the same sum for all $q > q'$ (Hardy, 1949, Theorems 117 and 118).

In practical examples, $n_{r+i}$ generally decreases slowly after the first few terms, and we are usually interested in small values of $r$, so that $\binom{r+i}{r}$ increases slowly. Under these circumstances, the series (18) is, after the first few terms, nearly a G.P. with ratio $-(\lambda - 1)$. Now, if we apply the $(E, q)$ method to such a G.P., say

$$ a_i = (-1)^i (\lambda - 1)^i, \qquad (41) $$

we obtain

$$ a_i^{(q)} = \frac{1}{(q + 1)^{i+1}} \sum_{j=0}^{i} \binom{i}{j} q^{i-j} (-1)^j (\lambda - 1)^j = \frac{(q - (\lambda - 1))^i}{(q + 1)^{i+1}} = \frac{1}{q + 1} \left( \frac{q - (\lambda - 1)}{q + 1} \right)^i, \qquad (42) $$


i.e. the transformed series is a G.P. with ratio $\dfrac{q - (\lambda - 1)}{q + 1}$. Clearly the best value of $q$ to select is $\lambda - 1$, which reduces all but the first term of the transformed series to zero.† If the $n_{r+i}$ decrease fairly rapidly, we may get better results by choosing $q$ somewhat smaller. (This is the case in Example (i) of §6, where $\lambda = 5$, but we take $q = 2$.) When $r = 0$ or 1, and if $n_1 \gg n_2$, it may be worth taking out the first term of the series, and applying the summation process to the remainder; the reason for this can be seen by considering such a series as

$$ 1 - \tfrac18 + \tfrac14 - \tfrac12 + 1 - \ldots. \qquad (43) $$

If we apply the $(E, 2)$ method directly, we get

$$ 0.333 + 0.208 + 0.139 + 0.093 + \ldots, \qquad (44) $$

while if we take out the first term and apply the $(E, 2)$ method to the remainder, we get

$$ 1.000 - 0.042 + 0.000 + 0.000 + \ldots, \qquad (45) $$

which is evidently better.
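
The figures in (44) and (45) can be reproduced with a short routine. This is a minimal sketch of the $(E, q)$ transformation in its standard form (Hardy, 1949), $a_i^{(q)} = (q+1)^{-(i+1)} \sum_j \binom{i}{j} q^{i-j} a_j$; the function name is illustrative, and the series (43) is taken to be $1 - \tfrac18 + \tfrac14 - \tfrac12 + 1 - \ldots$, i.e. geometric with ratio $-2$ after its first term.

```python
from math import comb


def euler_transform(a, q):
    """Terms of the (E, q) transform of the series with terms a[0], a[1], ...

    a_i^(q) = (q+1)^-(i+1) * sum_j C(i, j) q^(i-j) a_j
    """
    return [
        sum(comb(i, j) * q ** (i - j) * a[j] for j in range(i + 1))
        / (q + 1) ** (i + 1)
        for i in range(len(a))
    ]


series = [1, -1 / 8, 1 / 4, -1 / 2, 1]          # first terms of (43)
direct = euler_transform(series, 2)             # compare with (44)
split = [series[0]] + euler_transform(series[1:], 2)   # compare with (45)
print([round(x, 3) for x in direct])
print([round(x, 3) for x in split])
```

Because the remainder after the first term is exactly geometric with ratio $-2 = -q$, every transformed term after the first vanishes, which is the behaviour predicted by (42) for $q = \lambda - 1$.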

When we have chosen a method of summation, and selected a partial sum of the transformed series as probably giving a sufficiently good approximation to the final sum, we can express this partial sum as a linear combination of the $n_r$, and deduce its variance, as in §3.‡ But there is now a new source of error, namely, the omission of the rest of the transformed series. We have not been able to find a useful form of error term for this remainder (corresponding to the statement that alternate partial sums of (8) err in excess and defect); failing such an error term, our results must be used with caution when it is necessary to apply the summation process. If the $n_r$'s decrease slowly and $q$ is taken to be slightly smaller than $\lambda - 1$, the transformed series will generally have terms alternating in sign (cf. equation (42)); it might then be hoped that the partial sums err alternately in excess and defect, but it does not seem to be possible to lay down any simple general conditions under which this is the case.

Some of the methods described by Shanks (1955) also seem to be very well suited to our case; they have the property of summing perfectly any series which is geometric from some point onwards, so that the difficulty caused by an excessively large first term, noted above, does not arise. Given the series $\sum_{n=0}^{\infty} a_n$, we define a sequence (not a new series) by

$$ B_n = \frac{A_{n+1} A_{n-1} - A_n^2}{A_{n+1} + A_{n-1} - 2 A_n}; $$

repetition of the process gives a sequence $C_n$, and so on. The $e_1$ method consists of considering the sequence $B_n$ (in place of the sequence of partial sums $A_n = \sum_{r=0}^{n} a_r$), the $e_1^2$ method, of


t H e 
considering the sequence Cn, and so on; the & method consists of considering the sequen? 
Ao, By, ,,.... For an example, see §6, Example (i). 
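
The transformation just described is easily tried on the partial sums that arise in Example (i) of §6. The sketch below assumes the formula $B_n = (A_{n+1}A_{n-1} - A_n^2)/(A_{n+1} + A_{n-1} - 2A_n)$ for a single pass; the function name is illustrative, and the partial sums are those listed in the $A_n$ column of Table 2.

```python
def shanks(seq):
    """One pass of the transformation: for n = 1 .. len(seq)-2 return
    (A_{n+1} A_{n-1} - A_n^2) / (A_{n+1} + A_{n-1} - 2 A_n).

    The denominator can vanish for some sequences; no guard is included
    in this sketch.
    """
    return [
        (seq[n + 1] * seq[n - 1] - seq[n] ** 2)
        / (seq[n + 1] + seq[n - 1] - 2 * seq[n])
        for n in range(1, len(seq) - 1)
    ]


A = [1616, 592, 2192, -931, 5418]    # partial sums A_0 .. A_4 (Table 2)
B = shanks(A)                        # B_1, B_2, B_3
C = shanks(B)                        # C_2
print([round(x) for x in B], [round(x) for x in C])
```

Each pass shortens the sequence by two terms, so five partial sums allow only two repetitions; the rounded results agree with the $B_n$ and $C_n$ columns of Table 2.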


† This statement may appear to conflict with the remark of Hardy (1949), p. 180, that 'as $q$ increases, the $(E, q)$ methods form a scale of increasing strength'. But here 'strength' refers only to whether we obtain a convergent series or not: if we choose $q$ unnecessarily large, we shall certainly obtain a convergent series, but it will converge very slowly.

‡ This remark applies to any method of summation by a linear transformation of the series, e.g. Cesàro means, the composite $(E, q; C, k)$ method, any Hausdorff means, Hölder means, Hutton's method, any Nörlund means, or quasi-Hausdorff transformations; for references to all these methods see Hardy (1949), p. 392. It does not apply to the non-linear methods of Shanks (1955), mentioned below.


6. EXAMPLES

The first example is an artificial one designed to test the efficacy of the methods described above, especially the summation methods of §5. The second and third examples illustrate the practical applications, but enlarged samples are not available for verifying the estimates.

Example (i). Sample of words from 'Our Mutual Friend' by Charles Dickens. The following samples were taken:

A, of 1000 words, the last words of lines on pages ≡ 5 (mod 25),

B, of 2000 words, the last words of lines on pages ≡ 10 or 20 (mod 25),

C, of 2000 words, the last words of lines on pages ≡ 15 or 25 (mod 25),

the sampling in each case being carried as far as required to make up the prescribed number of words. Our original intention was to use $A$ as the basic sample ($N = 1000$) and to calculate the values of $\hat d(\lambda)$ and $\hat n_r(\lambda)$ given by (19) and (18) for $\lambda = 2, 3, 4, 5$, which could be checked against the values of $d(\lambda)$ and $n_r(\lambda)$ actually obtained from the samples $B$, $A + B$, $B + C$, $A + B + C$. The results, however, showed a systematic and, for $d(2)$, significant difference between the [...] result. Working back from sample $B$ with $\lambda = \tfrac12$ [...] agreed that sample $A$ had $n_1$ considerably too small. We believe that this is due to the fact that the method of sampling used was not sufficiently random; an uncommon word is likely to occur several times on the same page, where a particular topic is discussed, and the word is therefore less likely to occur just once in a sample selected as described than in a random sample of the same size.†

The results for larger values of $\lambda$ were, however, not much less accurate than those for [...]; we give the calculation of $\hat d(5)$ as an example of the use of the $(E, q)$ method of summation described in §5. Table 1 shows the [...] data; the $n'_r$ were obtained by graphical smoothing [...]. Our formula (19) gives us

$$ \hat d(5) = 528 + (4 \cdot 404 - 4^2 \cdot 64 + 4^3 \cdot 25 - 4^4 \cdot 12.2 + 4^5 \cdot 6.2 - \ldots). \qquad (46) $$

Table 1. Sample A; N = 1000

  r      n_r     n'_r
  1      404     404
  2       51      64
  3       24      25
  4       16      12.2
  5        6       6.3
  6        3       -
  7        0       -
  8        3       -
  ≥ 9     15       -

  d = 528

† Consider the extreme case when p. 1 reads 'one one one...', p. 2 'two two two...', and so on.

56 The number of new species covered when a sample is increased 


To transform the bracketed series we form the difference table suggested by (39) (q has 
been chosen as 2, so as to make the differences small): 


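$$\begin{array}{lrrrrr}
(\tfrac12)\cdot 4\cdot 404 &= 808 & & & & \\
 & & 552 & & & \\
(\tfrac12)^2\cdot 4^2\cdot 64 &= 256 & & 496 & & \\
 & & 56 & & 445 & \\
(\tfrac12)^3\cdot 4^3\cdot 25 &= 200 & & 51 & & 402 \\
 & & 5 & & 43 & \\
(\tfrac12)^4\cdot 4^4\cdot 12.2 &= 195 & & 8 & & \\
 & & -3 & & & \\
(\tfrac12)^5\cdot 4^5\cdot 6.2 &= 198 & & & &
\end{array}$$

(We apply the usual check, that the sum of each column is equal to the difference between the top and bottom of the one before.) The transformed series, by (39), is

$$\tfrac23\cdot 808 + (\tfrac23)^2\cdot 552 + (\tfrac23)^3\cdot 496 + (\tfrac23)^4\cdot 445 + (\tfrac23)^5\cdot 402 + \ldots = 538 + 245 + 147 + 88 + 53 + \ldots. \qquad (47)$$

The last few terms of (47) are approximately a geometric series with ratio 0.6; the sum of the remaining terms should therefore be approximately‡ $53 \times 0.6/(1 - 0.6) \approx 79$, making a total of 1150; hence

$$d(5) = 528 + 1150 = 1678. \qquad (48)$$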
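The arithmetic of the (E, q) summation can be checked with a short sketch. It assumes the standard series form of Euler's (E, q) transformation, in which the n-th transformed term is $(q+1)^{-(n+1)}\sum_{k\le n}\binom{n}{k}q^{n-k}a_k$; that this coincides with equation (39) of the paper is an assumption, although it reproduces the terms of (47). The function name `euler_q_terms` is illustrative only.

```python
from math import comb

def euler_q_terms(a, q):
    """Terms of the (E, q)-transformed series built from the terms a[0], a[1], ..."""
    return [
        sum(comb(n, k) * q ** (n - k) * a[k] for k in range(n + 1)) / (q + 1) ** (n + 1)
        for n in range(len(a))
    ]

# Smoothed counts n'_1..n'_5 from Table 1, and lambda = 5 as in (46).
n_smooth = [404, 64, 25, 12.2, 6.2]
lam = 5
# Terms of the bracketed series in (46): +4*n'_1, -4^2*n'_2, +4^3*n'_3, ...
a = [(-1) ** i * (lam - 1) ** (i + 1) * n for i, n in enumerate(n_smooth)]

terms = euler_q_terms(a, q=2)
ratio = 0.6                                # tail ratio quoted in the text
tail = terms[-1] * ratio / (1 - ratio)     # geometric estimate of the remaining terms
d5 = 528 + sum(terms) + tail
print([round(t, 1) for t in terms], round(tail, 1), round(d5, 1))
```

Because the sketch carries unrounded intermediate values, its total differs from (48) by about one unit.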


Applying the methods of Shanks (1955), described at the end of §5, we get Table 2. Although the transformed sequences are rather short, it looks as if $C_2 = 1155$ is a good approximation to the limit, giving $d(5) = 1683$. In fact, for the whole sample A+B+C, $d(5) = 1832$.


Table 2

 n     A_n     B_n     C_n
 0    1616
 1     592    1216
 2    2192    1134    1155
 3    -931    1162
 4    5418
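The entries of Table 2 can be reproduced with the following sketch. It assumes that $A_n$ are the partial sums of the bracketed series in (46) and that $B_n$ and $C_n$ come from one and two passes of Shanks's $e_1$ transformation (equivalent to Aitken's delta-squared process); the function name `shanks` is illustrative.

```python
def shanks(seq):
    """One pass of Shanks's e_1 transformation over a sequence of partial sums."""
    return [
        (seq[i + 1] * seq[i - 1] - seq[i] ** 2) / (seq[i + 1] + seq[i - 1] - 2 * seq[i])
        for i in range(1, len(seq) - 1)
    ]

# Partial sums A_0..A_4 as tabulated in Table 2.
A = [1616, 592, 2192, -931, 5418]
B = shanks(A)   # B_1, B_2, B_3
C = shanks(B)   # C_2
print([round(b) for b in B], round(C[0]), 528 + round(C[0]))
```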


In this example, we have been sli 
ilable; when using the (EZ, k) method 


sample B (N — 2000) as the basic sam 


although we can then verify the results only up to $\lambda = 2.5$. (The 'second samples' for $\lambda = 1.5, 2.0, 2.5$ are A+B, B+C, A+B+C respectively, and are thus all enlargements of the basic sample.) Table 3 shows the data for this basic sample; the $n'_r$ were produced graphically by the use of French curves, the $n''_r$ independently. Estimates of $d(\lambda)$, $n_1(\lambda)$ and the estimated percentage coverage, $100(1 - n_1(\lambda)/\lambda N)$, were calculated for $\lambda = 1.5, 2.0, 2.5$, using the three sets of values. The summation process was necessary only in the case $\lambda = 2.5$, with $q = 1$. Table 4 shows the three sets of estimates together with the values actually found in the enlarged samples; standard deviations are given where they could be calculated from (31), (33) and (34). It will be noticed that in this case nothing was gained when smoothed values were used; but it would probably be essential to use smoothed values when working with larger values of $\lambda$.

‡ This is, as Shanks (1955) points out, equivalent to applying the $e_1$ method to sum (47).


Table 3
Sample B; N = 2000

 r     n_r    n'_r    n''_r
 1     729    729     729
 2     108     96     110
 3      33     38      38
 4      23     21      19
 5      17     14      13
 6       7      9       9
 7       5      7       6
 8       3     32       -
≥9      30      -       -

d = 955


Example (ii). Captures of Macrolepidoptera in a light-trap at Rothamsted. (Quoted as example (i) in G, §8, from Williams's data in Corbet, Fisher & Williams (1943).) N = 15609, d = 240. Table 5 shows the data for small values of r. $n'_r$ is the smoothing of the example in G; $n''_r$ is Fisher's analytic smoothing given by $H_5$ of G with parameter $\beta = 40.2$. Now $H_5$ is a hypothesis defining the distribution of the population frequencies and it implies that


NA Y 
anon! (sara) . (49) 
*s PSI. abd saan = Pon s + 1) - 
(50)

Since $N \gg \beta$, we see that $H_5$ implies

$$\mathscr{E}(n_1(\lambda)) \simeq \beta \qquad (51)$$

and

$$\mathscr{E}(d(\lambda)) \simeq d + \beta\log_e\lambda. \qquad (52)$$

Putting $\lambda = 2$ we see that doubling the sample will approximately halve the proportion of the population not represented (by (51) and (3)) and increase the number of distinct species observed by approximately $\beta\log_e 2 = 27.9$. (The latter fact was noted by Williams in Corbet et al. (1943), p. 51.)




In general, if a fairly simple hypothesis $H$ on the $p_r$ (e.g. any of the hypotheses of G) gives a good fit for the $n_r$, we should prefer to deduce $\mathscr{E}(n_1(\lambda))$ and $\mathscr{E}(d(\lambda))$ from $H$, rather than use the distribution-free estimates (18) and (19); but such extrapolation should be made with caution, and the distribution-free methods may give a useful indication of the error to be expected if $H$ is false.

Example (iii). Sample of nouns in Macaulay's essay on Bacon. (From Yule (1944), Table 4.4, p. 163; quoted in G as example (iii), p. 260.) N = 8045, d = 2048.


Table 6

 r     n_r    n'_r    |   r         n_r    n'_r
 1     990    1024    |   11         24    15.5
 2     367     341    |   12         19    13
 3     173     170    |   13         10    11
 4     112     102    |   14         10     9
 5      72      68    |   15         13
 6      47      49    |   16-20      31
 7      41      35.5  |   21-30      31
 8      31      28.5  |   31-50      19
 9      34      22.7  |   51-100      6
10      17      18.4  |   101-∞       1


(As in the tables in G, $n_r$ and $n'_r$ have been summed where values of r are grouped.) Here $n'_r$ (= $n''_r$ of G) is the analytic smoothing

$$n'_r = \frac{2048}{r(r+1)} \qquad (53)$$

(a hypothesis of G; notice that this is not an explicit hypothesis on the $p$'s, and that it gives a good fit only for $r < 30$.)

(53) is so simple in form that we can carry through all our calculations analytically. Again we consider doubling the sample ($\lambda = 2$); by (18),

$$\hat n_1(2) = 2\sum_{i=0}^{\infty}(-1)^i(i+1)\,\frac{2048}{(i+1)(i+2)} = 2\cdot 2048\sum_{i=0}^{\infty}\frac{(-1)^i}{i+2} = 2\cdot 2048\,(1-\log_e 2) \simeq 1260;$$

$$\hat n_2(2) = 2^2\sum_{i=0}^{\infty}(-1)^i\,\frac{(i+1)(i+2)}{2}\,\frac{2048}{(i+2)(i+3)} = 2\cdot 2048\sum_{i=0}^{\infty}(-1)^i\,\frac{i+1}{i+3} = 2\cdot 2048\left[\sum_{i=0}^{\infty}(-1)^i - 2\sum_{i=0}^{\infty}\frac{(-1)^i}{i+3}\right].$$

The first series is summed by any standard method to $\tfrac12$ and the second is equal to $\log_e 2 - \tfrac12$; hence

$$\hat n_2(2) = 2\cdot 2048\,(\tfrac32 - 2\log_e 2) \simeq 465.$$

Finally, by (19),

$$\hat d(2) = 2048 - \sum_{i=1}^{\infty}(-1)^i\,\frac{2048}{i(i+1)} = 2048\left[1 - \sum_{i=1}^{\infty}\left(\frac{(-1)^i}{i} - \frac{(-1)^i}{i+1}\right)\right] = 2048\cdot 2\sum_{i=1}^{\infty}\frac{(-1)^{i+1}}{i} = 2048\cdot 2\log_e 2 \simeq 2840.$$
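The closed forms for the doubled sample are easy to evaluate numerically. The sketch below is an illustration only: it evaluates the three expressions $2\cdot 2048(1-\log_e 2)$, $2\cdot 2048(\tfrac32 - 2\log_e 2)$ and $2\cdot 2048\log_e 2$, and cross-checks the last against a direct summation of the alternating series with $n'_r = 2048/(r(r+1))$. The helper names are hypothetical.

```python
from math import log

d = 2048  # distinct nouns in the basic sample

def n_smooth(r):
    """Analytic smoothing (53): n'_r = 2048 / (r (r + 1))."""
    return d / (r * (r + 1))

n1_doubled = 2 * d * (1 - log(2))        # nouns expected once in the doubled sample
n2_doubled = 2 * d * (1.5 - 2 * log(2))  # nouns expected twice in the doubled sample
d_doubled = 2 * d * log(2)               # distinct nouns expected in the doubled sample

def d_doubled_numeric(n_terms=2000):
    """d + sum_{i>=1} (-1)^(i+1) n'_i, averaging the last two partial sums."""
    total, previous = float(d), float(d)
    for i in range(1, n_terms + 1):
        previous = total
        total += (-1) ** (i + 1) * n_smooth(i)
    return (total + previous) / 2

print(round(n1_doubled), round(n2_doubled, 1), round(d_doubled), round(d_doubled_numeric(), 2))
```

The unrounded values are close to, but not identical with, the rounded figures 1260, 465 and 2840 quoted in the text.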


Notice that, since the $n'_r$ give a good fit only for $r < 30$, the justification for substituting them in infinite series rests on the following argument:

(i) the partial sum of the series‡ with the true $\mathscr{E}(n_r)$ down to the term containing $\mathscr{E}(n_{30})$ is a good approximation to its infinite sum;

(ii) the same is true for the series‡ with the $n'_r$;

(iii) the $n'_r$ are good approximations to the $\mathscr{E}(n_r)$ for $r < 30$, so that the partial sums are nearly equal.

(Compare the argument justifying evaluation of integrals by the saddle-point method.) This should be borne in mind whenever an analytic smoothing is used in this way, or even when it is used to give values of $n'_r$ which are treated numerically; otherwise there is a risk of obtaining an apparently satisfactory convergence which is in fact spurious. It would probably be advisable in such cases to try a graphical smoothing as well. The reader might like to try the smoothing $n'_r$ of G, using the (E, 1) or $e_1$ method of summation.

We have now sufficient data to derive the result which was quoted in §1. By (3), the proportion of the population not represented in the 2048 nouns of the basic sample is about†

$$\frac{1024}{8045} = 12.7\,\%;$$

by (7) of G the proportion not represented in the 2840 - 1260 = 1580 nouns occurring twice or more in the doubled sample will be about

$$\frac{1260 + 2\cdot 465}{16090} = 13.6\,\%;$$

by (2) of G, the average frequency of the 1260 nouns occurring once only in the doubled sample will be about

$$\frac{2\cdot 465}{1260\cdot 16090} = 0.0046\,\%.$$

Hence, if we add a random selection of

$$\frac{13.6 - 12.7}{0.0046} \simeq 200$$

of the nouns occurring once only in the doubled sample to all those occurring twice or more, we will have a list of about 1780 nouns covering approximately the same proportion of the population as the 2048 nouns of the basic sample.

† The figure of 12.3 % given in G was based on the unsmoothed values.

‡ Or, if summation methods are used (as in calculating $\hat n_2(2)$ above), the sum of the transformed series.
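The coverage arithmetic of Example (iii) can be retraced in a few lines. This is a sketch using only the figures quoted in the text (N = 8045, $n'_1$ = 1024 and the doubled-sample estimates 1260, 465 and 2840); the variable names are illustrative.

```python
N = 8045                 # size of the basic sample
n1_basic = 1024          # smoothed n'_1 from (53)
n1_doubled, n2_doubled, d_doubled = 1260, 465, 2840  # estimates for the doubled sample

# Proportion of the population not represented in the basic sample.
missing_basic = n1_basic / N
# Proportion not represented among nouns occurring twice or more in the doubled sample.
missing_twice_or_more = (n1_doubled + 2 * n2_doubled) / (2 * N)
# Average population frequency of a noun occurring exactly once in the doubled sample.
avg_freq_once = 2 * n2_doubled / (n1_doubled * 2 * N)

# Number of once-only nouns needed to make up the difference in coverage.
extra_nouns = (missing_twice_or_more - missing_basic) / avg_freq_once
list_size = (d_doubled - n1_doubled) + extra_nouns
print(round(100 * missing_basic, 1), round(100 * missing_twice_or_more, 1),
      round(extra_nouns), round(list_size))
```

With unrounded percentages the number of extra nouns comes out slightly below the round figure of 200 used in the text.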




APPENDIX 


Conditions for the lemma of §1 


Although this lemma was not actually used in our argument, we give here, for any reader who may be interested, two fairly general sets of conditions under which it holds.

If $a_r \ge 0$ for all r, and finite numbers $b_r$ are defined by (4), then (5) holds if and only if


bru 


4 
5 +0 as i-o. (54) 
ti 

1 R (—1):5,,. 
Proof. Write R(n,r) = = x — HH a, 


r'i2o i! 


so that (5) holds if and only if R(n,r) —- 0 as n >o. Now, for all n> F 


R(n,r) 2 -5 ——, 2 sf*üg a 


ll 
P 
i Ms 
e 
N 
— 
R 
- 
[| 
> Ma 
| 
is 
© 
DA 
| 
a 
4 


a 


= 8\ (s—r—-1 
= 2 (-1)8 5 
a " () ( n E T 
using the definition (^ == 


pi: €ven if a is negative, together with the well-known identity 


hi bos] LT 

exe " T à x 

i t—1 i 

» and all other terms with s; +r+1 vanish; he 


R(n,r) = (—1)n x () ent 


Putting s = gives a term +a, nee 


s=n+r+1 n 
ao 8—7T—N (n+) (55) 
rq o Sees enim 
w Sembrpl] S—T mir! 
Now, since 4,2: 0 for all s, H 


eo 
| Rin, 7) | ek > airing, 
Air! 
wm A Brin, 
rini’? 
: ide 
and the sufficiency of (54) follows. The necessity is trivial, since if (54) does not hold the right-hand 9! 
of (5) cannot converge. 


If $a_r = O(x^r)$, $0 < x < \tfrac12$, then (5) holds for all r; further, (5) does not hold if $a_r = 2^{-r}$, so this result cannot be improved by extending the range of x.


P ve 
me without loss of generality that |a, | <æ". Then it follows from (55) abo 


soin 


IRanj< È 


s=ntr Mir! 
_ (n+r)!ant $ evi Ca 
3 nir! 1-0 t 
cz n+r+1 
i) 
—0 as n — co provided x< 1; 


E 


ria 


hence (5) holds for all r. 
(ii) Taking a, = 2-7, we have ee ES rd (4) 
r=0 A 
= 2.4 


summing the series as in (i), and it is clear that the right-hand side of (5) i. not convergent for any 7 




REFERENCES 


BROMWICH, T. J. I'A. (1926). An Introduction to the Theory of Infinite Series, 2nd ed. Cambridge University Press.

CORBET, A. S., FISHER, R. A. & WILLIAMS, C. B. (1943). The relation between the number of species and the number of individuals in a random sample of an animal population. J. Anim. Ecol. 12, 42-58.

DICKENS, CHARLES. Our Mutual Friend. London: Thomas Nelson. (First published in 1864-5.)

GOOD, I. J. (1953) (described in text as G). The population frequencies of species and the estimation of population parameters. Biometrika, 40, 237-64.

HARDY, G. H. (1949). Divergent Series. Oxford: Clarendon Press.

SHANKS, D. (1955). Nonlinear transformations of divergent and slowly convergent series. J. Math. Phys. 34, 1-42.

YULE, G. U. (1944). Statistical Study of Literary Vocabulary. Cambridge University Press.




A SEQUENTIAL TEST OF RANDOMNESS FOR EVENTS 
OCCURRING IN TIME OR SPACE 


BY D. J. BARTHOLOMEW


University College London and Scientific Department, National Coal Board, London


1. INTRODUCTION

In a variety of practical problems it is necessary to test whether a sequence of events is occurring at random in time or space. To the casual observer even a random series will suggest fluctuations in the density of events, so that an objective test is clearly required to establish randomness or otherwise.

The problem was discussed by Maguire, Pearson & Wynn (1952) as applied to industrial accidents; they pointed out that the basic data in such cases consisted of an ordered set of intervals between events. Recent investigations into the possible departures from ... to use a test based on intervals ... starting point. In this paper we ... events occur because the alternative ... A sequential test is of special value in this case because the observations become available one at a time; provided that they do not follow each other too rapidly it should be possible to carry out a test as the process develops.


2. THE ALTERNATIVES TO RANDOMNESS

In order to derive a sequential test we must first of all define the class of alternatives to which the test is to be sensitive; this choice, of course, will be governed by the practical aspects of the problem. If the events are accidents or machine failures, for example, it will be most ...

$$\Pr\{\text{event in } (T, T+dT)\} = \lambda(T)\,dT + o(dT) \quad (\lambda(T) > 0 \text{ for } T > 0),$$

where T is time (or distance) measured from the point at which observation begins and $\lambda(T)$ is a monotonic function of T. The case $\lambda(T)$ = constant is the Poisson process; this will be the null hypothesis corresponding to randomness. It will be convenient to refer to $\lambda(T)$ as the rate at which events occur. The problem thus becomes that of testing the hypothesis, $\lambda(T)$ = constant, against the alternatives $\lambda(T)$ increasing or decreasing. As an example of a situation where this sort of alternative would be suitable, the data given by Maguire et al. (1952, Table 1), concerning mine accidents, may be mentioned. Accident and failure data, already referred to, furnish many examples of this type.

The raw material for the test consists of the times at which events are observed to occur, $T_1, T_2, \ldots, T_n$, and the first step must be to find their joint distribution. The distribution of $T_i$ clearly depends only on that of $T_{i-1}$, so that $p(T_1, T_2, \ldots, T_n)$ may be obtained from the relation

$$p(T_1, T_2, \ldots, T_n) = \prod_{i=1}^{n} p(T_i \mid T_{i-1}), \qquad T_0 = 0.$$

Now

$$p(T_i \mid T_{i-1}) = \lambda(T_i)\exp\left[-\int_{T_{i-1}}^{T_i}\lambda(T)\,dT\right],$$

therefore

$$p(T_1, T_2, \ldots, T_n) = \prod_{i=1}^{n}\lambda(T_i)\,\exp\left[-\int_0^{T_n}\lambda(T)\,dT\right]. \qquad (1)$$

In order that this can represent a joint probability density function a further condition must be imposed on $\lambda(T)$, namely,

$$\lim_{T\to\infty}\int_0^{T}\lambda(T)\,dT = \infty.$$

This ensures that $\lambda(T)$ does not decay too rapidly; it is always satisfied if $\lambda(T)$ is monotonic increasing.
Before the Wald test can be derived $\lambda(T)$ must be completely specified. It will have to involve a scale factor as well as some parameter which determines the degree of departure from randomness. Without going farther than requiring that $\lambda(T)$ shall be monotonic decreasing or increasing a wide choice exists among the possible functional forms. It therefore seems reasonable to investigate the consequences of choosing that form which leads to the most straightforward mathematical treatment. For mathematical convenience therefore we choose

$$\lambda(T) = \mu(\mu T)^a \quad (a > -1).$$

This is monotonic decreasing for $-1 < a < 0$ and increasing for $0 < a < \infty$; the problem now becomes that of testing the hypothesis $a = 0$ against either or both the alternatives $a < 0$ or $a > 0$. $\mu$ is a nuisance parameter which will be eliminated from the problem.

The important question which now arises concerns how far the test derived on this assumption will be efficient when the true alternative takes some other form. For example, it implies that when $T = 0$, $\lambda(T) = 0$ if $a > 0$ and $\lambda(T) = \infty$ if $a < 0$ ... chosen for T is of ... importance. Although ... possible, there are good reasons for believing ... This matter will ... in connexion with the ... It may be noted that the fixed sample test (Bartholomew, 1955) based on the same alternative form of $\lambda(T)$ has a fairly high efficiency under a wide range of other alternatives. It seems reasonable, therefore, to assume that the sequential version of the test will have this property.

For the joint density function of the T's under the chosen alternative, we find

$$p(T_1, T_2, \ldots, T_n) = \mu^n\prod_{i=1}^{n}(\mu T_i)^a\,\exp\left[-(\mu T_n)^{a+1}/(a+1)\right]. \qquad (2)$$

It will be desirable to have a test which can be applied when the first few observations are missing or have been used to reach previous decisions. For this reason it will be convenient to work with the more general density function of $T_k, T_{k+1}, \ldots, T_n$ ($k \ge 1$) which may be obtained by integrating out $T_1, \ldots, T_{k-1}$ over the region $0 < T_1 < T_2 < \ldots < T_{k-1} < T_k$. We find

$$p(T_k, T_{k+1}, \ldots, T_n) = \frac{\mu^{n-k+1}(\mu T_k)^{(a+1)(k-1)}}{(a+1)^{k-1}(k-1)!}\prod_{i=k}^{n}(\mu T_i)^a\,\exp\left[-(\mu T_n)^{a+1}/(a+1)\right]. \qquad (3)$$
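A process with rate $\lambda(T) = \mu(\mu T)^a$ can be simulated by transforming a unit-rate Poisson process, which is a convenient way of producing test data for the procedure. The sketch below is not part of the paper: it relies on the standard time-change property that $\Lambda(T_i) = (\mu T_i)^{a+1}/(a+1)$ are the event times of a unit-rate Poisson process. The function name `simulate_times` and its arguments are illustrative.

```python
import random

def simulate_times(n, a, mu=1.0, seed=0):
    """Simulate n event times with rate lambda(T) = mu * (mu * T) ** a, for a > -1."""
    rng = random.Random(seed)
    times, s = [], 0.0
    for _ in range(n):
        s += rng.expovariate(1.0)  # next event time of a unit-rate Poisson process
        # Invert Lambda(T) = (mu * T) ** (a + 1) / (a + 1) = s for T.
        times.append(((a + 1) * s) ** (1.0 / (a + 1)) / mu)
    return times

random_times = simulate_times(10, a=0.0)    # null hypothesis: constant rate
increasing = simulate_times(10, a=1.0)      # rate increasing linearly with T
print(random_times[:3], increasing[:3])
```

With a = 0 the transformation reduces to dividing the unit-rate event times by $\mu$, i.e. an ordinary Poisson process of rate $\mu$.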


t and ean be eliminate 


he observations, Any 
test provided it satisfies two conditions: 


: it 
d from the problem by carrying A 
transformation will lead to the sat 


k Ua 
i. ES Tk - ET a v "S TET, ... Tii 
|OHÉS "erg o mea mya = 
V = uT. 


„mation | 
The test is based on the new variables Vhs Vp... 9, 1. The Ji acobian of the transformati 


: a | 
is of the form Vr ana (y, 125-4), Where ¢, isa function of the ?'s not depending oP“ | 
We thus obtain the joint density of Uo Ukts +++) Un_y and Vag - 


a 1)); 
PM p "e n q a $i (v, mm V,-1) Pol, mers Vn) Pl exp 53 [ pens t 
where d, = ( /T,)*-16an be ex 


fa. | 
Pressed as a function of the v’s which again is independent 
Integrating out for V from 


0 to infinity, we finally obtain 


n—1)! (4) 
Pp ... v, ul) = a ha) ais, att $3 ds. 
—k) 
It is now possible to form the sequential probability ratio test based on the $(n-k)$ variables $v_k, v_{k+1}, \ldots, v_{n-1}$. Thus

$$R = \frac{p(v_k, v_{k+1}, \ldots, v_{n-1} \mid a = a_0)}{p(v_k, v_{k+1}, \ldots, v_{n-1} \mid a = 0)} = (1+a_0)^{n-k}\,v_{n-1}^{a_0},$$

or taking logarithms and reverting to the original observations

$$\log R = (n-k)\log(1+a_0) - a_0\,Q(n,k),$$

where

$$Q(n,k) = -\log v_{n-1} = (n-1)\log T_n - (k-1)\log T_k - \sum_{i=k}^{n-1}\log T_i.$$

When $k = 1$, $Q(n,1) = -\sum_{i=1}^{n-1}\log(T_i/T_n)$, a form which brings out the relation with the fixed sample Q-test for testing whether points are random on a line, already mentioned in §2. We shall return to this.



To carry out the test we adopt the rule:

Accept $H_0$ if $\log R \le \log B$.

Reject $H_0$ if $\log R \ge \log A$.

Continue sampling until either of these inequalities is satisfied.

A and B are the usual constants related to the two kinds of error, $\alpha$ and $\beta$, by the equations

$$A = (1-\beta)/\alpha, \qquad B = \beta/(1-\alpha).$$

In practice it is more convenient to rearrange the inequalities so that sampling is continued as long as

$$\frac{-\log B + (n-k)\log(1+a_0)}{a_0} > Q(n,k) > \frac{-\log A + (n-k)\log(1+a_0)}{a_0} \qquad (a_0 > 0),$$

the left-hand quantity being the acceptance boundary and the right-hand the rejection boundary. By the method of Cox (1952) it is easy to show that the test terminates with probability one.
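The decision rule can be sketched in code as follows. This is a minimal illustration, assuming $\log R = (n-k)\log(1+a_0) - a_0 Q(n,k)$ with $Q(n,k) = (n-1)\log T_n - (k-1)\log T_k - \sum_{i=k}^{n-1}\log T_i$ and Wald's constants $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$; the function name `sequential_q_test` and the artificial input sequences are hypothetical.

```python
from math import log

def sequential_q_test(times, a0, alpha, beta, k=1):
    """Run the sequential test on event times T_1 < T_2 < ... (times[0] is T_1).

    Returns (decision, n), where n is the number of events observed at the decision.
    """
    log_A = log((1 - beta) / alpha)
    log_B = log(beta / (1 - alpha))
    logs = [log(t) for t in times]
    for n in range(k + 1, len(times) + 1):
        # Q(n, k) = (n-1) log T_n - (k-1) log T_k - sum_{i=k}^{n-1} log T_i
        q = (n - 1) * logs[n - 1] - (k - 1) * logs[k - 1] - sum(logs[k - 1:n - 1])
        log_r = (n - k) * log(1 + a0) - a0 * q
        if log_r <= log_B:
            return "accept H0", n
        if log_r >= log_A:
            return "reject H0", n
    return "continue", len(times)

# Artificial sequences: evenly spaced events, and events whose rate grows with T.
evenly_spaced = [float(i) for i in range(1, 41)]
accelerating = [i ** 0.5 for i in range(1, 41)]
print(sequential_q_test(evenly_spaced, a0=1.0, alpha=0.05, beta=0.05))
print(sequential_q_test(accelerating, a0=1.0, alpha=0.05, beta=0.05))
```

Working with $\log R$ directly, rather than with the rearranged boundaries for $Q(n,k)$, avoids having to reverse the inequalities when $a_0$ is negative.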

It sometimes happens that the sample point $(n, Q(n,k))$ falls near the upper test boundary and is followed by a long delay before the occurrence of the $(n+1)$th event. The question then arises as to whether a decision can be reached before this event occurs. If in the formula for $Q(n+1,k)$ we replace $T_{n+1}$ by the time at which a decision is required, say $T'$ ($T_n < T' < T_{n+1}$), and denote the value of Q so obtained by $Q'(n+1,k)$ we have

$$Q(n+1,k) > Q'(n+1,k).$$

If the point $(n+1, Q'(n+1,k))$ falls outside the boundary so will $(n+1, Q(n+1,k))$, so a decision can be obtained more quickly in this way. No such rule can be used at the lower boundary, nor is one necessary, because a long delay will always move the sample point towards the upper boundary when there is no question of the lower boundary being crossed.


4. PROPERTIES OF THE TEST 


(i) The operating characteristic (O.C.)

The O.C. function, which is the probability of accepting the null hypothesis when a is the true value, is given by

$$L(a) = \frac{A^{h(a)} - 1}{A^{h(a)} - B^{h(a)}},$$

where $h(a)$ is the non-zero root of the equation

$$\int_S\left[\frac{p(v_k, \ldots, v_{n-1} \mid a_0)}{p(v_k, \ldots, v_{n-1} \mid 0)}\right]^{h(a)} p(v_k, \ldots, v_{n-1} \mid a)\,dv_k \ldots dv_{n-1} = 1,$$

and S is the region in which the sample point must lie. Substituting known quantities in this equation we find

$$(1+a_0)^{(n-k)h}\int_S v_{n-1}^{a_0 h}\,p(v_k, \ldots, v_{n-1} \mid a)\,dv_k \ldots dv_{n-1} = 1,$$

or

$$(1+a_0)^{(n-k)h}\,\mathscr{E}(v_{n-1}^{a_0 h} \mid a) = 1.$$

Now $v_{n-1} = T_k^{k}\,T_{k+1}\ldots T_{n-1}/T_n^{n-1}$. Putting $y_i = T_i/T_n$ ($i = k, \ldots, n-1$), $Y = \mu T_n$ in the joint distribution of the T's, we obtain

$$p(y_k, \ldots, y_{n-1} \mid a) = \frac{(n-1)!}{(k-1)!}\,(1+a)^{n-k}\,y_k^{(a+1)(k-1)}\prod_{i=k}^{n-1}y_i^{a}.$$

Hence it is found that

$$\mathscr{E}(v_{n-1}^{a_0 h} \mid a) = \left(1 + \frac{h a_0}{1+a}\right)^{-(n-k)},$$

so that the equation for h becomes

$$1 + \frac{h a_0}{1+a} = (1+a_0)^h, \qquad (5)$$

which is independent of n and k.


Corresponding to any solution of equation (5) with h, $a_0$ and a there will correspond a second solution with $-h$, $b_0$ and b where

$$b_0 = -a_0/(1+a_0), \qquad b = (a-a_0)/(1+a_0). \qquad (6)$$

Table 1. The value of L(a) for various a

 a                    -1     0       a'                          a_0    ∞
 L(a)  (a_0 < 0)       0     1-α     log A/(log A - log B)       β      1
 L(a)  (a_0 > 0)       1     1-α     log A/(log A - log B)       β      0

Then it is easy to show that $L(b) = 1 - L(a)$, with $\alpha$ and $\beta$ interchanged. As h tends to zero $L(a)$ approaches the value $\log A/(\log A - \log B)$, or $\tfrac12$ if $\alpha = \beta$. Expanding the right-hand side of equation (5) we find that

$$1 + h\log(1+a_0) + O(h^2) = 1 + h a_0/(1+a),$$

which has a root $h = 0$ if $a = a_0/\log(1+a_0) - 1$. Denoting this value of a by $a'$, we have the expansion

$$a' = \tfrac12 a_0 - \tfrac1{12}a_0^2 + O(a_0^3).$$

The five points on the O.C. curve given in Table 1 are sufficient for many purposes when only a rough indication of its shape is required. A more detailed solution can be obtained by solving (5) for other values of h. As a first approximation $h \simeq 1 - 2a/a_0$; more accurate values may be obtained by an iterative procedure or more simply from the intersection of the two curves

$$y = 1 + h a_0/(1+a) \quad\text{and}\quad y = \exp[h\log_e(1+a_0)].$$

Charts could easily be constructed for this purpose if required.

As an indication of the way in which the probability of picking out a departure from randomness falls off when the true value of a is rather less than $a_0$, the results, all calculated for $a = 0.8a_0$, given in Table 2 are perhaps instructive.
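The non-zero root of $1 + h a_0/(1+a) = (1+a_0)^h$ and the resulting value of $L(a) = (A^h - 1)/(A^h - B^h)$ can be computed numerically. The sketch below uses simple bisection; it is an illustration rather than the iterative or graphical procedure the author had in mind, and it is not intended for a very close to $a'$, where the non-zero root approaches zero. The function names are hypothetical.

```python
from math import log

def solve_h(a, a0, tol=1e-12):
    """Non-zero root h of 1 + h*a0/(1+a) = (1+a0)**h, by bisection."""
    def f(h):
        return 1 + h * a0 / (1 + a) - (1 + a0) ** h
    # f is concave with f(0) = 0; the sign of f'(0) says on which side the other root lies.
    slope_at_zero = a0 / (1 + a) - log(1 + a0)
    sign = 1.0 if slope_at_zero > 0 else -1.0
    lo, hi = sign * 1e-6, sign * 1.0   # f(lo) > 0 provided the root is not too near 0
    while f(hi) >= 0:                  # expand until the root is bracketed
        hi *= 2
    while abs(hi - lo) > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def oc(a, a0, alpha, beta):
    """Probability L(a) of accepting H0 when the true parameter is a."""
    h = solve_h(a, a0)
    A = (1 - beta) / alpha
    B = beta / (1 - alpha)
    return (A ** h - 1) / (A ** h - B ** h)

h = solve_h(a=-0.4, a0=-0.5)
print(round(h, 3), round(1 - oc(-0.4, -0.5, 0.05, 0.05), 3))
```

For $a = 0$ the root is $h = 1$ and for $a = a_0$ it is $h = -1$, which recovers the points $L(0) = 1 - \alpha$ and $L(a_0) = \beta$.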


(ii) The average sample number (A.S.N.)

Let us define

$$z_i = \log\frac{p(v_i \mid v_{i-1}, \ldots, v_k;\ a = a_0)}{p(v_i \mid v_{i-1}, \ldots, v_k;\ a = 0)},$$

which in our case becomes

$$z_i = \log(1+a_0) + a_0(\log v_i - \log v_{i-1}).$$

Wald (1947) showed that, provided the expected value of $z_i$ does not depend on i, the A.S.N. was given by

$$\mathscr{E}_a(n \mid a_0) = \frac{L(a)\log B + (1 - L(a))\log A}{\mathscr{E}_a(z_i)},$$

where $\mathscr{E}_a(n \mid a_0)$ and $\mathscr{E}_a(z_i)$ are the expected values of n and $z_i$ when $a = a_0$ is used in the test and the true value is a.

Table 2. Values of (1 - L(a)) when a = 0.8a_0

                                   Value of 1 - L(a) when
  a_0       a         h           α = β = 0.05    α = β = 0.01
 -0.5     -0.4      -0.516           0.820           0.915
 -2/7     -0.229    -0.560           0.839           0.929
  0.4      0.32     -0.633           0.866           0.948
  1.0      0.8      -0.664           0.876           0.955

From the transformation of the previous section it is easily shown that $\mathscr{E}(-\log v_i) = i/(a+1)$ and hence that

$$\mathscr{E}_a(z) = \log(1+a_0) - a_0/(1+a),$$

which is independent of i. At the point $a = a'$ the above expression for the average sample number becomes indeterminate; in this case Wald showed that

$$\mathscr{E}_{a'}(n \mid a_0) = -\log A\,\log B/\mathscr{E}_{a'}(z^2).$$

He also remarked (Wald, 1947) that the maximum value of $\mathscr{E}_a(n \mid a_0)$ usually occurs at or near this point. By a method similar to that used above, or directly from a theorem of Epstein & Sobel (1954), it may be shown that

$$\mathscr{E}_{a'}(z^2) = [\log(1+a_0)]^2,$$

so that

$$\mathscr{E}_{a'}(n \mid a_0) = -\log A\,\log B/[\log(1+a_0)]^2.$$

Numerical values for $\mathscr{E}_a(n \mid a_0)$ are given in Tables 3 and 4. Table 3 shows how the A.S.N. varies with the choice of $a_0$ and the true value a. If the table is used with negative values of $a_0$ and a, then it may easily be shown that

$$\mathscr{E}_b(n \mid b_0) = \mathscr{E}_a(n \mid a_0),$$

with $\alpha$ and $\beta$ interchanged and the relation between $b_0$, b and $a_0$, a as given in equation (6). The row $a = 1$ is of special interest because it represents a linear increase in $\lambda(T)$. Corresponding to this we have $b = 1 + 2b_0$ but this no longer has the same physical significance. The values $a = -1$ and $a = \infty$ represent the most extreme departures from randomness that are obtainable with the chosen form of $\lambda(T)$.

Table 4 gives a more detailed tabulation of $\mathscr{E}_{a_0}(n \mid a_0)$ and $\mathscr{E}_0(n \mid a_0)$ to assist in setting up a test. In the same way as Table 3 it may be used for negative values of the parameter.
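Wald's approximation for the average sample number can be evaluated directly once $L(a)$ is known. The sketch below assumes $\mathscr{E}_a(z) = \log(1+a_0) - a_0/(1+a)$ and returns the expected number of increments $z_i$; treating the tabulated sample number as this quantity plus k is an assumption made here, not a statement from the paper. The function name `expected_steps` is hypothetical.

```python
from math import log

def expected_steps(a, a0, alpha, beta, L):
    """Wald's approximation to the expected number of increments z_i.

    L is the probability of accepting H0 at the true value a, for example
    L = 1 - alpha when a = 0 and L = beta when a = a0.
    """
    log_A = log((1 - beta) / alpha)
    log_B = log(beta / (1 - alpha))
    mean_z = log(1 + a0) - a0 / (1 + a)   # expected value of z_i under a
    return (L * log_B + (1 - L) * log_A) / mean_z

under_alternative = expected_steps(a=1.0, a0=1.0, alpha=0.01, beta=0.01, L=0.01)
under_null = expected_steps(a=0.0, a0=1.0, alpha=0.01, beta=0.01, L=0.99)
print(round(under_alternative, 1), round(under_null, 1))
```

The formula is indeterminate at $a = a'$, where $\mathscr{E}_a(z) = 0$; the separate expression $-\log A\log B/[\log(1+a_0)]^2$ applies there.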




Table 3. Approximate values of E,(n | ay) for various a, f and ay 


e ay 0-2 0-4. 0-6 0-8 | 1-0 | 


| N |oo 05 | 0-01 005 | 0-01 0-05 | 0-01 0-05 0-01 005 | 
| | 
|2 EN C I P TEN RTI 
| 
| 0-01 0-01 
—1 ! or | or 
| 0-05 0 0 0 0| 0 0 0 0 0 0 0-05 
a5 | 901. 5 54 233 2 14 14 10 10 1 1 0-01 
228) oon | 58 35 5 5 9 9 6 6 5 5 0-05 
| 
| 0-01 | 256 T 712 617 36 33 23 21 " 
0 = P f a z 16 15 0:01 
0-05 166 5 7 43 23 21 15 ll 10 0-05 
| Ge 62 61 41 44 ) 
5 5 o 28 0-01 
| 0-05 | 409 261 | 120 62 39 41 25 28 


& 
e 
© 
= 
e 
e 
e 
cs 
2 
e 
m 
œ 
RI 
= 
sw 
so 
o 
e 


"1-9 dde ul, 
% | 9:05 | 9 7 : 5 5 o ; 24 16 0-01 
741868 190 | 88- m | as S0 Wo moa d] se 
0-01 | 94 61 66 43 — — =. 
95 | 0-05] 93 eg os Ole a Pe le ua 0b 
| | a — 0-05 
901] 6 — 36 | s4 ge | $$ er 24 
1-0 a 16 24 i 0-01 
Ons |, S5 36 a8- di | se i7 28 35 Jas 16 
| | 23 15 0-05 
ot en ee DE 
7 |005| 25 16 | l4 9 | 10 @ | Bee 5 E. o 
| T MEE. . 4 f 
| 991 0-05 | 001 0-05 | 0-01 oos | 0-01 905 | oioi pos a 
I "UD 
|- —— [—— EL B | 
A 


For $a_0 > 0$, use the top and left-hand marginal scale; for negative $a_0$, written as $b_0$, use the bottom and right-hand marginal scale.

(iii) Saving in sample size effected by the sequential test

As an alternative to the application of the sequential test described above, fixed sample tests could be used. In this case we should decide in advance to observe a fixed number of events and then to carry out a test for a trend. At the occurrence of the nth event $(n-1)$ observations $T_1, \ldots, T_{n-1}$ would have been made in the interval $(0, T_n)$. It is easily verified from equation (2) that the variables defined by $y_i = T_i/T_n$ have the same joint distribution as a sample of $(n-1)$ from the distribution

$$p(y) = (a+1)\,y^{a} \quad (0 \le y \le 1). \qquad (7)$$




Table 4. The average sample number for the sequential Q-test 


The upper figure is $\mathscr{E}_{a_0}(n \mid a_0)$ and the lower $\mathscr{E}_0(n \mid a_0)$.


a| 001 | 0-025 | 0-05 0-01 0.025 | 0-05 0-01 | 0025 | 0-05 
| 0-01 0-01 0-01 0-025 | 0-025 | 0-025 | 0-05 | 0-05 0-05 
cT | | 
a | 
EaR 366 293 237 356 279 229 340 269 216 
e 329 320 305 | 363 | 251 | 242 | 213 205 194 
; mM * 2g] | 220 180 268 212 | 170 
0-200 289 231 187 2 | 
256 248 237 | 204 | 195 188 166 160 | 
: 52 | 228 179 147 218 173 138 
0-225 235 188 152 22 
205 199 190 | 164 156 151 133 128 121 
" | 190 149 122 182 144 116 
0-25 196 156 127 
169 164 157 135 129 124 109 105 100 
x 138 110 90 132 106 85 
0-30 144 115 93 8 : 
121 116 112 97 92 89 78 75 71 
ə | 108 85 70 103 82 66 
0-35 In 89 12 7 67 59 7 
91 89 85 73 70 5 54 
87 69 56 83 66 53 
O4 90 72 58 : 
a 72 70 67 58 58 5s zi os 43 
49 72 57 47 69 55 44 
0-4, 75 60 43 38 37 : 
S és BT 54 47 45 35 
62 49 40 59 47 38 
0. 63 51 a 36 32 
50 48 4 45 39 37 30 29 
: 45 36 29 
32 47 37 31 
0-6 48 39 26 23 23 
0 ae 35 33 29 27 21 
38 30 25 36 29 23 
0-70 39 31 22 21 21 18 18 17 
28 27 
21 32 25 21 30 24 19 
0-80 32 A 18 17 17 15 14 14 
22 22 21 
25 18 27 21 18 26 21 17 
0-90 28 14 14 12 12 ll 
18 18 17 15 
24 19 16 23 18 15 
1-00 24 20 16 12 12 ll 10 
a 16 15 15 1 10 


For negative values of the parameter, take $b_0 = -a_0/(1+a_0)$ and interchange $\alpha$ and $\beta$ so that we have

$$\mathscr{E}_0(n \mid b_0) = \mathscr{E}_{a_0}(n \mid a_0), \qquad \mathscr{E}_{b_0}(n \mid b_0) = \mathscr{E}_0(n \mid a_0).$$




Pearson (1938) showed that the test based on the statistic

$$Q_1 = -2\sum_{i=1}^{n-1}\log y_i = -2\sum_{i=1}^{n-1}\log(T_i/T_n) = 2Q(n,1)$$

was the uniformly most powerful test of the hypothesis $a = 0$ with respect to the class of alternatives defined by (7) with $a > 0$. By making the transformation $u = -2\log y$ in (7) we find that

$$p(u) = \tfrac12(a+1)\,e^{-\frac12(a+1)u},$$

from which it follows that $Q_1$ is distributed as $\chi^2_{2(n-1)}/(1+a)$. A more general form of $Q_1$ corresponding to $Q(n,k)$ has been given by the author in unpublished work. The following analysis is applicable to the more general case by writing $(n-k+1)$ in place of n.

The purpose of this section is to investigate the saving in sample size achieved by adopting the sequential procedure; this is important because it means that decisions will be reached more quickly using this method.

The number of observations, n, required for the fixed sample test of strength $(\alpha, \beta)$ is determined by finding the degrees of freedom, $\nu = 2(n-1)$, for which the distribution of $\chi^2$ has the upper $100\beta$ % point = $(1+a_0)$ × its lower $100\alpha$ % point. If we limit discussion to values of $a_0$ sufficiently small (say $a_0 \le 0.5$), so that $\nu$ will be large enough to make use of Fisher's normal approximation to the distribution of $\chi^2$, i.e. take $\sqrt{(2\chi^2)}$ to be $N(\sqrt{(2\nu-1)}, 1)$, then for given values of $a_0$, $\alpha$ and $\beta$ we have to solve the equation


V(4n — 8) - X; = (1 +a)! (J(4n — 5) — X ,). i 
Here X, and X, are the appropriate standardized deviates of the normal curve, €.£: 
a= | 2 et de] (25). Solving the equation for n, we find 
m= LI (CGU ap! Xp)? — 0:495]. 


Table 5. Average percentage saving in sample size if sequential test used 


| 
fio, Qr 0-3 Limit as 
| a>0 
| 
a | 
0-05 001 005 0-01 0-05 0-01 
l 
mE E. 
0-05 57 68 54 56 51 63 
* brue 
0-01 52 63 50 62 47 58 pm 
0-05 43 4l | 46 43 51 47 
pue 
0-01 56 53 59 54 |  e3 58 poen 


Table 5 gives the average saving in sample size, using the sequential test, for various $a_0$, expressed as a percentage. This is applicable for negative $a_0$ by taking $b_0 = -a_0/(1+a_0)$ and interchanging both $\alpha$ and $\beta$, and '$H_0$ true' and '$H_1$ true'.

The saving in sample size is never less than 40 % over the range of $a_0$, $\alpha$ and $\beta$ that are likely to be encountered in practice; in fact it is often considerably more than this.




5. THE TWO-SIDED TEST

Where interest centres on the detection of both increases and decreases, a two-sided version of the test is required. In this case we test the hypothesis $H_0$ ($a = 0$) against the alternative $H_1$ ($a = a_0$) or $H_2$ ($a = a_1$); because of the symmetry of the two cases $a = a_0$ and $a = -a_0/(1+a_0)$ it is convenient to take $a_1 = -a_0/(1+a_0) = b_0$, although this is not necessary, of course.

Two methods have been put forward to deal with this situation. The first uses a likelihood ratio formed by taking the simple average of the ratios required for the separate tests and the second consists of running two one-sided tests simultaneously. The latter method will be preferred because it requires no additional calculation; it is not difficult to see the relation between the two methods.

Details of how the test is carried out are given in Armitage (1947); the procedure becomes ambiguous if it is possible to accept both $H_1$ and $H_2$, but this is only possible for unusual values of $\alpha$ and $\beta$, and even then the probability of such a happening is infinitesimal. The O.C. function and A.S.N. function have not been obtained for the two-sided test, but in practice our knowledge for the one-sided scheme will be adequate for setting up a test procedure.


... readily calculable. Information regarding the distribution of the decisive sample number as well as its average value would also be valuable.


6. APPLICATION OF THE TEST

As usual in setting up a sequential test we are faced with the initial problem of deciding on the specification of parameter values for the alternative hypotheses $H_1$ (and $H_2$ also if the two-sided test is used). In the present situation $H_0$ is clearly the hypothesis of randomness with $a = 0$, but as is frequently the case in other types of problem there is no clear-cut simple alternative $H_1$. The most appropriate alternative to introduce must be settled paying regard to the o.c. curve and the A.S.N. associated with different values of the parameter and of $\alpha$ and $\beta$.

There is also a further difficulty in the present case. The practical significance of changes in $a$ is less easy to grasp than that of changes in, say, the mean of a normal distribution or the probability $p$ of a binomial. To help in the physical interpretation of $a$, Table 6 has been prepared giving, for different values of $a$, the number of events which would occur, on the average, in successive equal periods of time starting from $T = 0$. These results are standardized by making the expectation in the first period equal to unity. Although these can be obtained from the figures for the first six periods, we have also given at the bottom of the table the expected cumulative totals after 5, 10, ..., 25, 30 periods.

Values in the table are based on the relation

$$\frac{\text{expected number occurring in } (0, N_2)}{\text{expected number occurring in } (0, N_1)} = \left(\frac{N_2}{N_1}\right)^{a+1}.$$

It will be seen that broadly speaking the $\lambda(T)$ law corresponds to a situation where there is a fairly sharp initial change in the expectation with a gradual slowing off. Thus for $a = 0.4$, the expectation in the 3rd period is double that in the 1st, but this is not doubled again until

74 Test of randomness for events occurring in time or space 


the 14th period is reached. The way in which this table may be used in combination with 
Tables 3 and 4 will be illustrated in connexion with the example given below. 
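The entries of Table 6 can be regenerated from the relation quoted above. The following is a minimal sketch, assuming the law $\lambda(T) = \mu T^a$ so that the expected count in $(0, N)$ is proportional to $N^{a+1}$; the function names are illustrative and do not come from the paper.

```python
def expected_in_period(n, a):
    """Expected events in the n-th unit period, as a multiple of the first period.

    Assumes the cumulative expectation over (0, N) is proportional to N**(a + 1).
    """
    return n ** (a + 1) - (n - 1) ** (a + 1)


def expected_cumulative(n, a):
    """Expected cumulative total over periods 1..n, in the same units."""
    return float(n) ** (a + 1)


if __name__ == "__main__":
    # Column values inferred from the table: five positive a and their negative partners.
    for a in (1.0, 0.6, 0.4, 0.3, 0.2, -1 / 6, -3 / 13, -2 / 7, -3 / 8, -1 / 2):
        row = [round(expected_in_period(n, a), 2) for n in (1, 2, 3, 5, 10, 30)]
        print(f"a = {a:+.3f}: periods 1, 2, 3, 5, 10, 30 -> {row}")
```

For $a = 0.4$ this gives about 2.02 for the 3rd period and just under 4 for the 14th, in line with the remark that the expectation doubles by the 3rd period but does not double again until the 14th.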

The fact that the test may be applied to a sequence of observed times $T_k, T_{k+1}, \ldots$, where $(k - 1)$ observations have occurred in the interval $(0, T_k)$, permits a certain amount of variety in its use. The most straightforward application is where the origin for $T$ is taken at the point of time when observations start; here $k = 1$. If a decision is reached at $T_n$, and we wish to


Table 6. Expected number of events under the alternative $\lambda(T) = \mu T^a$ in successive unit periods, expressed as multiples of the expectation for the first period

| Period | a = 1.0 | a = 0.6 | a = 0.4 | a = 0.3 | a = 0.2 | a = -1/6 | a = -3/13 | a = -2/7 | a = -3/8 | a = -1/2 |
| 1  |  1.00 |  1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 2  |  3.00 |  2.03 | 1.64 | 1.46 | 1.30 | 0.78 | 0.70 | 0.64 | 0.54 | 0.41 |
| 3  |  5.00 |  2.77 | 2.02 | 1.71 | 1.44 | 0.72 | 0.63 | 0.55 | 0.45 | 0.32 |
| 4  |  7.00 |  3.39 | 2.30 | 1.89 | 1.54 | 0.68 | 0.58 | 0.50 | 0.39 | 0.27 |
| 5  |  9.00 |  3.94 | 2.56 | 2.04 | 1.62 | 0.65 | 0.54 | 0.47 | 0.35 | 0.24 |
| 6  | 11.00 |  4.45 | 2.77 | 2.17 | 1.69 | 0.63 | 0.52 | 0.44 | 0.33 | 0.21 |
| 7  | 13.00 |  4.92 | 2.96 | 2.28 | 1.74 | 0.61 | 0.50 | 0.41 | 0.31 | 0.20 |
| 8  | 15.00 |  5.36 | 3.13 | 2.38 | 1.80 | 0.59 | 0.48 | 0.40 | 0.30 | 0.18 |
| 9  | 17.00 |  5.77 | 3.29 | 2.47 | 1.84 | 0.58 | 0.47 | 0.39 | 0.28 | 0.17 |
| 10 | 19.00 |  6.18 | 3.45 | 2.55 | 1.88 | 0.57 | 0.46 | 0.38 | 0.27 | 0.16 |
| 15 | 29.00 |  7.96 | 4.08 | 2.90 | 2.05 | 0.53 | 0.42 | 0.33 | 0.23 | 0.13 |
| 20 | 39.00 |  9.51 | 4.59 | 3.17 | 2.17 | 0.51 | 0.39 | 0.31 | 0.21 | 0.11 |
| 25 | 49.00 | 10.91 | 5.04 | 3.39 | 2.27 | 0.49 | 0.37 | 0.29 | 0.19 | 0.10 |
| 30 | 59.00 | 12.19 | 5.42 | 3.59 | 2.36 | 0.48 | 0.35 | 0.27 | 0.18 | 0.09 |

Expected cumulative totals

| Periods | a = 1.0 | a = 0.6 | a = 0.4 | a = 0.3 | a = 0.2 | a = -1/6 | a = -3/13 | a = -2/7 | a = -3/8 | a = -1/2 |
| 1-5  |  25.00 |  13.13 |   9.52 |  8.10 |  6.90 |  3.82 |  3.45 |  3.16 | 2.73 | 2.24 |
| 1-10 | 100.00 |  39.81 |  25.12 | 19.95 | 15.85 |  6.81 |  5.88 |  5.18 | 4.22 | 3.16 |
| 1-15 | 225.00 |  76.16 |  44.31 | 33.80 | 25.78 |  9.55 |  8.03 |  6.92 | 5.43 | 3.87 |
| 1-20 | 400.00 | 120.68 |  66.29 | 49.13 | 36.41 | 12.14 | 10.02 |  8.50 | 6.50 | 4.47 |
| 1-25 | 625.00 | 172.47 |  90.60 | 65.66 | 47.59 | 14.62 | 11.89 |  9.97 | 7.48 | 5.00 |
| 1-30 | 900.00 | 230.88 | 116.94 | 83.23 | 59.23 | 17.02 | 13.68 | 11.35 | 8.38 | 5.48 |

continue looking out for changes in $\lambda(T)$, we may now start again with a new sequence of times $T'_i = T_{n+i} - T_n$ $(i = 1, 2, \ldots)$. This procedure would clearly be appropriate if, for example, the events were machine failures and the machine was reset when an increase in the failure rate had been established.


On the other hand, if a decision in favour of $H_0$ is reached after $(k_1 - 1)$ observations it is possible to continue observation with $T$ measured from the initial origin. The test statistic to be compared with the boundaries given in §3 would now be

$$Q(n, k_1) = Q(n, 1) - Q(k_1, 1).$$

It is important to notice that the inference made at the second decision in this case applies to the whole period from the initial origin and not to the period from the last decision. As long


D. J. BARTHOLOMEW 75 


as decisions of the same kind are reached it would seem that the process can be continued ... after the $s$th decision. If, however, there is a change in decision from $H_0$ to $H_1$, or vice versa, justification for the further use of the original origin is more doubtful. It is clear that here, as in all sequential applications, if a change in the law occurs during the course of observation, the strict theoretical basis of the test breaks down and its utility must to a large extent be based on empirical study. Some investigation of this kind is in hand in the present case.


7. A NUMERICAL EXAMPLE 


We have taken the data given by Maguire et al. (1952) for explosions in coal-mines in Great Britain since 1875 involving the loss of ten lives or more. These authors gave the intervals in days between successive accidents, from which values of $T$ with origin at 6 December 1875 can readily be found. Fig. 1 shows a plot of the data.

[Fig. 1. Time plot of mine accident data. Time scale: one unit = 1000 days. Origin at 6 December, 1875. The 1st, 2nd and 3rd decisions are marked on the plot.]

Let us suppose that, starting from the beginning of the record, we are on the look-out for evidence of a reduction in accident rate. This means that $a$ must be taken negative; for definiteness in reference to the tables we shall denote this value by $b_0$. In a situation of this kind we should like to reach a provisional opinion as to what was happening fairly rapidly, so there would be no object in playing for safety using very small values of $\alpha$ and $\beta$. Let us therefore take the largest values of $\alpha$ and $\beta$ given in Tables 3 and 4, namely, $\alpha = \beta = 0.05$. Then, using Table 3, let us examine the consequences of choosing particular values for $b_0$.

(i) If we adjust the test so as to be efficient in picking out a small value of $b_0$, say $-\frac{1}{6}$, we must expect to wait a very long time for an answer. It will be seen that $\mathcal{E}_0(n \mid b_0) = 170$ and $\mathcal{E}_1(n \mid b_0) = 151$. Supposing an accident rate of 3 per annum in the year 1875, this means that using the test so adjusted we might expect to wait 57 years before establishing that $\lambda(T)$


was remaining constant. The reason for this is that a change in $\lambda(T)$ represented by the expectations in the column headed $b = -\frac{1}{6}$ in Table 5 would be very hard to distinguish from a random situation because, after the initial drop in frequency (which may be obscured by sampling fluctuations), there is only a very gradual falling off. For example, as Table 5 shows, with $b = -\frac{1}{6}$ the expectation in the 30th period is 26% below that in the 5th period, while for $b = -\frac{3}{8}$ it is 49% less.
[Fig. 2. Mine accident data; application of sequential tests with $a_0$ ($= b_0$) $= -\frac{1}{3}$. Upper part: $Q(n, 1)$ plotted against $n$. Lower part: $Q(n - 23, 1)$ plotted against $n$. Boundaries are solid lines for $\alpha = \beta = 0.05$ and broken lines for $\alpha = \beta = 0.10$.]

(ii) If $b_0 = -\frac{1}{6}$ and the true decrease is more marked, say $b = -\frac{1}{2}$, then we see from Table 3 that this change would be detected, on average, after 35 observations.† This is satisfactory, but we note that if the test had been adjusted to be most efficient for $b = -\frac{1}{2}$, i.e. by making $b_0 = -\frac{1}{2}$, then on the average only 21 observations instead of 35 would have been needed to reach a decision.


(iii) Keeping $b_0 = -\frac{1}{6}$ we should expect to detect a smaller decrease of $b' = -\frac{1}{12}$ with probability $\frac{1}{2}$, but this, on average, would take 261 observations. To sum up, with $b_0 = -\frac{1}{6}$ we shall stand a good chance of detecting any decrease of importance, but we may have to wait a very long time to do so.


(iv) If we go to the other extreme and take $b_0 = -\frac{1}{2}$, Table 3 shows that a decision will be reached fairly quickly whatever the true value of $b$. A decrease with $b = \frac{1}{3}(3b_0 - 1) = -\frac{5}{6}$


† This figure comes from the section of the table with $b_0 = -\frac{1}{6}$, $b = \frac{1}{3}(3b_0 - 1) = -\frac{1}{2}$.




will be detected almost as quickly as if $b_0$ had been chosen equal to $-\frac{5}{6}$; however, though decisions will be reached quickly if $|b| < |b_0|$, say in the neighbourhood of $-\frac{1}{4}$, half of them will be incorrect, i.e. half will be in favour of $H_0$ or randomness.

These considerations suggest that an intermediate value of $b_0$, say $-\frac{1}{3}$ (corresponding to the positive value of $a = 0.5$), will be most appropriate. If the series is random we should reach a decision for $H_0$ on the average after 43 observations, while if the expectation is decreasing with $a = -\frac{1}{3}$, after 29 observations.

If we are prepared to take greater risks of wrong decisions, e.g. make $\alpha = \beta = 0.10$, the process could, of course, be speeded up.

Fig. 2 shows the values of $Q$ plotted against $n$, taking $a_0 = -\frac{1}{3}$. This diagram should be studied in conjunction with Fig. 1. We start on the left with values of

$$Q(n, 1) = (n - 1)\log_{10} T_n - \sum_{i=1}^{n-1} \log_{10} T_i$$

and draw the boundary lines for $\alpha = \beta = 0.05$ (solid rules) at $(n - 1) \times 0.528 \pm 3.836$. A decision in favour of randomness is reached at the 23rd accident, for which $T_{23} = 2326$ days. Had we used $\alpha = \beta = 0.10$ (boundaries, the broken lines) a decision in favour of randomness would have been reached at $T_{10}$.
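The statistic and the two boundary lines used in this example can be sketched in a few lines. The boundary formulae below are an assumption, not a quotation from §3: they are Wald's approximate limits applied to the quantities $\ln(T_n/T_i)$, converted to base-10 logarithms. Under that assumption the printed constants 0.528 and 3.836 are reproduced for $a_0 = -\frac{1}{3}$ and $\alpha = \beta = 0.05$. Function names are illustrative.

```python
import math


def q_statistic(times):
    """Q(n, 1) = (n - 1) log10 T_n - sum_{i < n} log10 T_i for ordered event times."""
    n = len(times)
    return (n - 1) * math.log10(times[-1]) - sum(math.log10(t) for t in times[:-1])


def boundary_constants(a0, alpha, beta):
    """Return (slope, upper_intercept, lower_intercept) of the two boundary lines.

    Assumed form (Wald's approximations): the lines are
    (n - 1) * slope + upper_intercept and (n - 1) * slope + lower_intercept.
    """
    slope = math.log10(1 + a0) / a0
    upper = math.log10((1 - beta) / alpha) / abs(a0)
    lower = -math.log10((1 - alpha) / beta) / abs(a0)
    return slope, upper, lower


if __name__ == "__main__":
    slope, upper, lower = boundary_constants(-1 / 3, 0.05, 0.05)
    print(f"slope = {slope:.3f}, intercepts = {upper:+.3f}, {lower:+.3f}")
```

With equal risks the two intercepts are symmetric, which is why a single value with a plus-or-minus sign suffices in the text.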
If it is wished to continue the sequential test beyond $T_{23}$ (about the beginning of the year 1883), two courses are now open. We may keep the origin at the original starting point and calculate $Q(n, 24)$; this is effectively done by still plotting $Q(n, 1)$ but shifting the limits as shown in Fig. 2, so that the new origin coincides with point $Q(24, 1)$. Following this plan we reach two successive decisions in favour of a drop in the accident rate, at 8042 days and at 16,051 days. These results are shown in the upper part of Fig. 2.
On the other hand, we may start again with a new origin at $T_{23}$, taking $T'_i = T_{23+i} - T_{23}$ and plotting $Q(n - 23, 1)$ as shown in the lower part of Fig. 2. While it appears as though a decision in favour of $H_0$ will be reached, ... 31st observation since the first decision, the trend is reversed. Had we used $\alpha = \beta = 0.10$ a decision ... made at ... (10,024 days), but with $\alpha = \beta = 0.05$ it is ... that we should come down in favour of an established ... rate.

It is ... general conclusions from this simple ... When a drop occurs, it cannot be described in terms of a ... The appearance of the spots in Fig. 1 suggests the existence ... I think, however, that the illustration throws some light on ... of the test. It also suggests points for further investigation. We may ask ... reached more quickly when the origin is not changed after ... (a) because the fall-off in $\lambda$ is better represented by ... (b) because the high initial ... enhanc... investigation ... alternative trends having different mathematical forms ... test is based. A beginning has been made by applying the ... generated by the law $\lambda(T) = \ldots$. Details of the results are given in Bartholomew (1955).




8. SUMMARY 


A method has been given for testing whether a sequence of events is occurring at random in time or space when the alternative is a trend. The test is derived on the assumption that the alternative can be represented by a function of the form $\lambda(T) = \mu T^a$ for $-1 < a < \infty$. The theory of sequential analysis is used to derive the test together with its operating characteristic and average sample number functions. The application of the test under various circumstances has been discussed from the theoretical standpoint and illustrated on mining accident data.


My thanks are due to Dr N. L. Johnson, who supervised this work, and to Prof. E. S. Pearson for much valuable help in preparing the work for publication; also to the Department of Scientific and Industrial Research for the award of a maintenance grant.


REFERENCES 


ARMITAGE, P. (1947). J. R. Statist. Soc. B, 9, 250.

BARTHOLOMEW, D. J. (1955). Ph.D. Thesis, University of London.

COX, D. R. (1952). Proc. Camb. Phil. Soc. 48, 290.

EPSTEIN, B. & SOBEL, M. (1954). Ann. Math. Statist. 25, 373.

MAGUIRE, B. A., PEARSON, E. S. & WYNN, A. H. A. (1952). Biometrika, 39, 168.

PEARSON, E. S. (1938). Biometrika, 30, 134.

WALD, A. (1947). Sequential Analysis. New York: John Wiley and Sons Inc.; London: Chapman and Hall Ltd.




ON THE MOMENTS OF THE MAXIMUM OF PARTIAL SUMS OF
A FINITE NUMBER OF INDEPENDENT NORMAL VARIATES


By A. A. ANIS 
Chelsea Polytechnic, London 


SUMMARY. The paper is concerned with the maximum $U_n$ of the partial sums $X_1, X_1 + X_2, \ldots, X_1 + X_2 + \ldots + X_n$ of $n$ independent standard normal variates.

The distribution of $U_n$ is of interest in the theory of storage. Suppose we have a reservoir of infinite capacity, which receives every year a random input, from rivers, etc., whose distribution is normal $N(\mu, 1)$, and releases, for civil purposes, the mean discharge $\mu$. The probability that, starting with an initial water level $x$, the reservoir will not run dry in the following $n$ years is given by the distribution function $F_n(x)$ of $U_n$.

The first and second moments of the variate $U_n$ have been previously obtained (Anis & Lloyd, 1953; Anis, 1955). Each of these moments was studied on its own merits, and no systematic method of attack was seen at that time. In the present paper, a method for obtaining all the moments is discussed. A recurrence relation is obtained which makes possible their numerical evaluation.


1. STATEMENT OF THE PROBLEM

Consider $n$ independent standard normal variates $X_1, X_2, \ldots, X_n$ and their partial sums

$$X_1 + X_2 + \ldots + X_r \quad (r = 1, 2, \ldots, n).$$

Let $U_n$ denote the maximum of these partial sums. Our problem is to obtain the moments of $U_n$.

We shall always use the symbols $\phi(x)$, $\Phi(x)$ to denote respectively the frequency function and the distribution function of a standard normal variate. We shall also use $F_n(x)$, $f_n(x)$ to denote respectively the distribution function and the frequency function of $U_n$, i.e.

$$F_n(x) = \Pr(U_n \le x), \quad f_n(x) = F'_n(x).$$

Then

$$F_n(x) = \int_K \prod_{i=1}^{n} \phi(t_i)\, dt_i, \tag{1.1}$$

where the region of integration $K$ is defined by

$$K: \ \sum_{i=1}^{r} t_i \le x \quad (r = 1, 2, \ldots, n).$$
1 
2. THE MOMENTS AS A LINEAR FORM OF THE $M_j(n+1)$

It may be deduced from (1.1) that

$$F_{n+1}(x) = \int_0^\infty \phi(x - t)\, F_n(t)\, dt \tag{2.1}$$

and

$$f_{n+1}(x) = \phi(x) F_n(0) + \int_0^\infty \phi(x - t)\, f_n(t)\, dt. \tag{2.2}$$

Also it is well known that

$$\phi(x - t) = \phi(x) \sum_{j=0}^{\infty} H_j(x) \frac{t^j}{j!}, \tag{2.3}$$

where $H_j(x)$ is the Hermite polynomial of degree $j$. On substituting from (2.3) into (2.2) we get

$$f_{n+1}(x) = \phi(x) \sum_{j=0}^{\infty} H_j(x) \frac{M_j(n+1)}{j!}, \tag{2.4}$$

where

$$M_j(n+1) = \int_0^\infty t^j f_n(t)\, dt \quad (j \ne 0) \tag{2.5}$$

and $M_0(n+1) = 1$.

Multiplying both sides of (2.4) by $H_j(x)$ and integrating over the range $(-\infty, \infty)$, we obtain

$$M_j(n+1) = \int_{-\infty}^{\infty} H_j(x)\, f_{n+1}(x)\, dx. \tag{2.6}$$

Using the well-known explicit forms for the $H_j(x)$, we easily obtain the following relations:

$$\mu'_1(n+1) = M_1(n+1), \quad \mu'_2(n+1) = 1 + M_2(n+1), \quad \mu'_3(n+1) = 3M_1(n+1) + M_3(n+1), \ \ldots \text{ etc.}, \tag{2.7}$$

where

$$\mu'_j(n+1) = \int_{-\infty}^{\infty} x^j f_{n+1}(x)\, dx. \tag{2.8}$$

Hence the problem of evaluating the moments is reduced to that of evaluating the functions $M_j(n+1)$.


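3. $M_j(n+1)$ AS A LINEAR FORM IN THE $F_r(0)$

On applying equation (2.2) to reduce $f_n(t)$ to $f_{n-1}(t)$, we obtain

$$M_j(n+1) = F_{n-1}(0) \int_0^\infty u^j \phi(u)\, du + \int_0^\infty\!\!\int_0^\infty y_1^j\, \phi(y_1 - y_2)\, f_{n-1}(y_2)\, dy_1\, dy_2.$$

Continuing this process of reduction from $f_{n-1}$ to $f_{n-2}$, ..., etc., we finally get

$$M_j(n+1) = \sum_{r=1}^{n} a(r, j)\, F_{n-r}(0), \tag{3.1}$$

where

$$a(r, j) = \int_0^\infty\!\!\cdots\!\int_0^\infty y_1^j\, \phi(y_1 - y_2)\, \phi(y_2 - y_3) \cdots \phi(y_{r-1} - y_r)\, \phi(y_r) \prod_{i=1}^{r} dy_i \quad (r > 1), \qquad a(1, j) = \int_0^\infty y^j \phi(y)\, dy. \tag{3.2}$$

It may be appropriate to recall at this stage an important lemma proved in a previous paper (Anis, 1955) and which we need in the sequel. This is

$$\sum_{r=0}^{\infty} F_r(0)\, t^r = (1 - t)^{-1/2}. \tag{3.3}$$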

4. A RECURRENCE RELATION FOR THE $a(r, j)$

In an earlier paper (Anis & Lloyd, 1953) the integral

$$b_s = \underbrace{\int_0^\infty\!\!\cdots\!\int_0^\infty}_{(s-1)} \phi(y_1)\, \phi(y_1 - y_2) \cdots \phi(y_{s-2} - y_{s-1})\, \phi(y_{s-1}) \prod_{i=1}^{s-1} dy_i = (2\pi)^{-1/2} s^{-3/2} \tag{4.1}$$

was discussed. Here we shall show the whole results depend on this integral. We make use in (3.2) of the identity

$$y_1 = \sum_{s=1}^{r} (r - s + 1)(2y_s - y_{s-1} - y_{s+1}) \tag{4.2}$$

(with the convention that $y_0 = y_1$; $y_{r+1} = 0$): hence

$$a(r, j) = \sum_{s=1}^{r} (r - s + 1) \int_0^\infty\!\!\cdots\!\int_0^\infty y_1^{j-1} (2y_s - y_{s-1} - y_{s+1})\, \phi(y_1 - y_2) \cdots \phi(y_{r-1} - y_r)\, \phi(y_r) \prod_{i=1}^{r} dy_i. \tag{4.3}$$

The first term of this sum may be integrated by parts and gives $r(j-1)\,a(r, j-2)$. The other terms may also be integrated by parts; the presence of the terms $(2y_s - y_{s-1} - y_{s+1})$ has the effect of dividing every integral into two multiplied parts, one of the form $a(r', j')$ for some $r'$, $j'$ and the other of the form $b_s$ for some $s$. This leads us directly to the recurrence relation

$$a(r, j) = r(j-1)\, a(r, j-2) + \sum_{s=1}^{r-1} (r - s)\, b_{r-s}\, a(s, j-1). \tag{4.4}$$

When $j = 1$, we get

$$a(r, 1) = \sum_{s=0}^{r-1} (r - s)\, b_{r-s}\, F_s(0); \tag{4.5}$$

the introduction of $F_s(0)$ in this last relation is due to the fact that

$$\int_0^\infty\!\!\cdots\!\int_0^\infty \phi(y_1 - y_2) \cdots \phi(y_{s-1} - y_s)\, \phi(y_s) \prod_{i=1}^{s} dy_i = F_s(0), \tag{4.6}$$

i.e. $F_s(0) = a(s, 0)$. If we define $c_r$ as

$$c_r = r b_r = (2\pi r)^{-1/2}, \tag{4.7}$$

then (4.4) and (4.5) may be written in the form

$$a(r, j) = r(j-1)\, a(r, j-2) + \sum_{s=1}^{r-1} c_{r-s}\, a(s, j-1) \quad (j \ge 2), \qquad a(r, 1) = \sum_{s=0}^{r-1} c_{r-s}\, a(s, 0). \tag{4.8}$$

We now have the position that the required moments $\mu'_j(n+1)$ of the original variate $U_{n+1}$ are known (2.7) in terms of the functions $M_j(n+1)$, which are themselves known (3.1) in terms of the functions $a(r, j)$; and we have a recurrence relation for the $a(r, j)$ involving only the known coefficients $c_r$. We shall not require the values of these functions $a(r, j)$; in the next section we replace the recurrence relation for $a(r, j)$ by a difference-differential equation for their generating functions, and show that from these we may construct a generating function for the $M_j$.


5. A DIFFERENCE-DIFFERENTIAL EQUATION OF THE GENERATING FUNCTION OF $a(r, j)$

Let us define the generating function $Q_j(t)$ as follows:

$$Q_j(t) = \sum_{r=1}^{\infty} a(r, j)\, t^r \quad (j \ge 1), \qquad Q_0(t) = \sum_{r=0}^{\infty} a(r, 0)\, t^r = \sum_{r=0}^{\infty} F_r(0)\, t^r = (1 - t)^{-1/2} \tag{5.1}$$

(using the lemma of (3.3)). From (4.8) we know that

$$\begin{aligned} a(1, j) &= (j-1)\, a(1, j-2), \\ a(2, j) &= 2(j-1)\, a(2, j-2) + c_1\, a(1, j-1), \\ a(3, j) &= 3(j-1)\, a(3, j-2) + c_2\, a(1, j-1) + c_1\, a(2, j-1), \ \ldots \text{ etc.} \end{aligned} \tag{5.2}$$

If we multiply both sides of the first equation by $t$, the second by $t^2$, and so on, and add the terms $a(s, j-2)$ vertically but those of $a(s, j-1)$ diagonally, we get

$$Q_j(t) = (j-1)\, t\, Q'_{j-2}(t) + \alpha(t)\, Q_{j-1}(t), \tag{5.3}$$

where

$$\alpha(t) = \sum_{r=1}^{\infty} c_r t^r. \tag{5.4}$$

$\alpha(t)$ is absolutely convergent for all values of $t$ in the range $|t| < 1$. If $j = 1$, then the second relation of (4.8) would lead us to

$$Q_1(t) = \alpha(t)\, Q_0(t). \tag{5.5}$$

We may observe with the aid of (3.1) that

$$Q_0(t)\, Q_j(t) = (1 - t)^{-1/2} Q_j(t) = \sum_{n=1}^{\infty} \Big\{ \sum_{r=1}^{n} a(r, j)\, F_{n-r}(0) \Big\} t^n,$$

or

$$Q_0(t)\, Q_j(t) = \sum_{n=1}^{\infty} M_j(n+1)\, t^n, \tag{5.6}$$

i.e. $Q_0(t)\, Q_j(t)$ is the generating function of $M_j(n+1)$. We have earlier reduced the question of evaluating the moments to that of evaluating $M_j(n+1)$. Equation (5.6) reduces this last question to that of obtaining $Q_0(t)\, Q_j(t)$.

It may be appropriate here to work out, as examples, the explicit expressions of $M_1$, $M_2$, $M_3$.

The value of $Q_1(t)$ is given by (5.5); $Q_2(t)$ and $Q_3(t)$ could be obtained directly from (5.3) as functions of $\alpha(t)$, $Q_0(t)$ and their derivatives. By this method we get

$$\begin{aligned} Q_0(t)\, Q_1(t) &= (1-t)^{-1} \alpha(t), \\ Q_0(t)\, Q_2(t) &= \tfrac{1}{2} t (1-t)^{-2} + (1-t)^{-1} \alpha^2(t), \\ Q_0(t)\, Q_3(t) &= \tfrac{3}{2} t (1-t)^{-2} \alpha(t) + 2t (1-t)^{-1} \alpha'(t) + (1-t)^{-1} \alpha^3(t). \end{aligned}$$

Equating the coefficient of $t^n$ on both sides of these equations we get

$$\begin{aligned} M_1(n+1) &= (2\pi)^{-1/2} \sum_{r=1}^{n} r^{-1/2}, \\ M_2(n+1) &= \tfrac{1}{2} n + (2\pi)^{-1} \sum_{r=1}^{n-1} \sum_{s=1}^{r} \{s(r-s+1)\}^{-1/2}, \\ M_3(n+1) &= \tfrac{3}{2} n (2\pi)^{-1/2} \sum_{r=1}^{n} r^{-1/2} + \tfrac{1}{2} (2\pi)^{-1/2} \sum_{r=1}^{n} r^{1/2} + (2\pi)^{-3/2} \sum_{r=1}^{n-2} \sum_{s=1}^{r} \sum_{k=1}^{s} \{k(s-k+1)(r-s+1)\}^{-1/2}. \end{aligned} \tag{5.7}$$

The values of $M_1$, $M_2$ and $M_3$ were computed from these formulae. The third term of $M_3$ was computed using the recurrence relation

$$p_3(n) = \sum_{r=1}^{n} r^{-1/2}\, p_2(n - r + 1), \tag{5.8}$$

where

$$p_2(n) = \sum_{r=1}^{n} \sum_{s=1}^{r} \{s(r-s+1)\}^{-1/2} \tag{5.9}$$

and

$$p_3(n) = \sum_{r=1}^{n} \sum_{s=1}^{r} \sum_{k=1}^{s} \{k(s-k+1)(r-s+1)\}^{-1/2}. \tag{5.10}$$

Since the values of $p_2(r)$ are available from the computation of $M_2$, this recurrence relation simplifies the computation of $p_3(r)$ (and hence $M_3$) considerably. Unfortunately this procedure could not be continued much further since the formulae rapidly become unwieldy.

In the following section we show how to obtain $M_r(n+1)$ for $r > 3$, and hence the moments of $U_n$, from recurrence relations which depend on the values of $M_1$, $M_2$ and $M_3$ obtained above.

We may also mention that an explicit solution may be found for the difference-differential equation (5.3). Details will be published elsewhere.


6. A RECURRENCE RELATION IN $M_j(n+1)$

We have shown that

$$Q_0(t)\, Q_j(t) = \sum_{n=1}^{\infty} M_j(n+1)\, t^n.$$

Hence

$$Q_j(t) = (1 - t)^{1/2} \sum_{n=1}^{\infty} M_j(n+1)\, t^n, \tag{6.1}$$

and

$$Q'_j(t) = -\tfrac{1}{2} (1 - t)^{-1/2} \sum_{n=1}^{\infty} M_j(n+1)\, t^n + (1 - t)^{1/2} \sum_{n=1}^{\infty} n\, M_j(n+1)\, t^{n-1}. \tag{6.2}$$

On substituting from (6.1), (6.2) into (5.3), and equating the coefficient of $t^n$ on both sides of the equation we obtain the recurrence relation

$$M_j(n+1) = \sum_{r=1}^{n-1} c_r\, M_{j-1}(n - r + 1) + (j-1)\, n\, M_{j-2}(n+1) - \tfrac{1}{2}(j-1) \sum_{r=1}^{n-1} M_{j-2}(r+1). \tag{6.3}$$

This recurrence relation finally gives us a practical method for the numerical computation of the moments. The computation of $M_4(n+1)$ was done by using (6.3), taking for $M_1$, $M_2$ and $M_3$ the values given in §5. From these it is a very small step to the first four moments of $U_n$. The following table gives for $n = 2(1)15$: the mean value $\mu'_1$ of $U_n$; the moments about the mean, $\mu_2$, $\mu_3$, $\mu_4$; and the moment ratios $\gamma_1 = \mu_3/\mu_2^{3/2}$, $\gamma_2 = \mu_4/\mu_2^2 - 3$. It will be seen that as $n$ increases, the distribution becomes increasingly asymmetrical and leptokurtic.

Short table of the first four moments of $U_n$ and of $\gamma_1$, $\gamma_2$

| n  | μ'_1   | μ_2    | μ_3     | μ_4      | γ_1    | γ_2    |
| 2  | 0.3989 | 1.3408 |  0.3265 |   5.6733 | 0.2103 | …      |
| 3  | 0.6810 | 1.6953 |  0.7881 |   9.3981 | 0.357… | …      |
| 4  | 0.9114 | 2.0536 |  1.3540 |  14.1409 | …      | …      |
| 5  | 1.1108 | 2.4136 |  2.0075 |  19.9160 | 0.535… | 0.4186 |
| 6  | 1.2893 | 2.7745 |  2.7385 |  26.7048 | 0.5926 | …      |
| 7  | 1.4521 | 3.1360 |  3.5392 |  34.5131 | …      | …      |
| 8  | 1.6029 | 3.4978 |  4.4042 |  43.3413 | 0.6732 | 0.5425 |
| 9  | 1.7440 | 3.8599 |  5.3289 |  53.1895 | 0.702… | 0.5700 |
| 10 | 1.8769 | 4.2222 |  6.3098 |  64.0580 | 0.7273 | 0.5932 |
| 11 | 2.00…  | …      |  7.3438 |  75.9468 | 0.7481 | 0.6132 |
| 12 | …      | …      |  8.4282 |  88.8562 | 0.7660 | 0.6304 |
| 13 | 2.23…  | 5.3100 |  9.5609 | 102.7862 | 0.7814 | 0.6…   |
| 14 | …      | 5.6727 | 10.7400 | 117.7369 | 0.7949 | …      |
| 15 | 2.4558 | 6.0355 | 11.9634 | 133.7086 | 0.8068 | …      |
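The table can be regenerated from the formulae of §§5 and 6. The sketch below is an illustration rather than the author's computing scheme: it takes $M_1$ and $M_2$ from the explicit sums (the double sum for $M_2$ is written as a convolution of the $c_r$), applies the recurrence for $j = 3, 4$, and converts to moments about the mean with the relations between the $M$'s and the moments. The function name and the returned layout are assumptions made for the example.

```python
import math


def anis_moments(n_max):
    """Return {N: (mean, mu2, mu3, mu4, gamma1, gamma2)} for U_N, N = 2..n_max."""
    top = n_max - 1                      # M_j(n + 1) is needed for n = 1..top
    c = [0.0] + [1.0 / math.sqrt(2.0 * math.pi * r) for r in range(1, top + 1)]
    # M[j][n] stores M_j(n + 1); index 0 is unused padding.
    M = {j: [0.0] * (top + 1) for j in range(1, 5)}
    for n in range(1, top + 1):
        M[1][n] = sum(c[1:n + 1])
        M[2][n] = n / 2.0 + sum(
            c[s] * c[r - s] for r in range(2, n + 1) for s in range(1, r)
        )
    for j in (3, 4):                     # recurrence for j = 3, then j = 4
        for n in range(1, top + 1):
            conv = sum(c[r] * M[j - 1][n - r] for r in range(1, n))
            tail = sum(M[j - 2][r] for r in range(1, n))
            M[j][n] = conv + (j - 1) * (n * M[j - 2][n] - 0.5 * tail)
    out = {}
    for n in range(1, top + 1):
        m1, m2, m3, m4 = (M[j][n] for j in range(1, 5))
        mu2 = m2 - m1 ** 2 + 1
        mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
        mu4 = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
               + 6 * m2 - 6 * m1 ** 2 + 3)
        out[n + 1] = (m1, mu2, mu3, mu4, mu3 / mu2 ** 1.5, mu4 / mu2 ** 2 - 3)
    return out


if __name__ == "__main__":
    for N, row in anis_moments(15).items():
        print(N, " ".join(f"{v:9.4f}" for v in row))
```

Running it prints one line per $n$ from 2 to 15 in the column order of the table, which offers a way to fill entries that are hard to read in a given copy.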


7. THE LIMITING VALUES OF $\gamma_1$, $\gamma_2$

From the linear relations between the $M$'s and the moments of $U_n$ about the origin, it is easy to obtain the following relations:

$$\begin{aligned} \mu_2 &= M_2 - M_1^2 + 1, \\ \mu_3 &= M_3 - 3M_1 M_2 + 2M_1^3, \\ \mu_4 &= M_4 - 4M_1 M_3 + 6M_1^2 M_2 - 3M_1^4 + 6M_2 - 6M_1^2 + 3. \end{aligned} \tag{7.1}$$

Hence

$$\begin{aligned} \gamma_1 &= (M_3 - 3M_1 M_2 + 2M_1^3)/(M_2 - M_1^2 + 1)^{3/2}, \\ \gamma_2 &= (M_4 - 4M_1 M_3 + 6M_1^2 M_2 - 3M_1^4 + 6M_2 - 6M_1^2 + 3)/(M_2 - M_1^2 + 1)^2 - 3. \end{aligned} \tag{7.2}$$

Now it may be proved that

$$\lim_{n \to \infty} n^{-j/2} M_j = 2^{j/2}\, \Gamma\{\tfrac{1}{2}(j+1)\}/\sqrt{\pi}. \tag{7.3}$$

This may be done by induction, using the recurrence relation (6.3) of the $M_j$'s, and substituting for $M_{j-1}$, $M_{j-2}$ their asymptotic values obtained from (7.3) for sufficiently large $n$. The summation sign in the recurrence relation would be replaced by an integral sign in the course of the proof. Since (7.3) is true for $j = 1, 2$ (Anis & Lloyd, 1953; Anis, 1955) it is in general true.

Computing these limits for $j = 3, 4$ from (7.3) we may proceed easily to obtain the limiting values of $\gamma_1$, $\gamma_2$ when $n$ tends to infinity. These are

$$\gamma_1 = \left(\frac{2}{\pi}\right)^{1/2} \left(\frac{4}{\pi} - 1\right) \Big/ \left(1 - \frac{2}{\pi}\right)^{3/2} \approx 0.9953,$$

$$\gamma_2 = \frac{8}{\pi} \left(1 - \frac{3}{\pi}\right) \Big/ \left(1 - \frac{2}{\pi}\right)^{2} \approx 0.8692.$$

I am grateful to Prof. E. S. Pearson for drawing my attention to the question of the limiting values of $\gamma_1$ and $\gamma_2$. I am also indebted to Mr S. Michaelson (Imperial College) for his part in the computation.


REFERENCES 


ANIS, A. A. (1955). The variance of the maximum of partial sums of a finite number of independent normal variates. Biometrika, 42, 96.

ANIS, A. A. & LLOYD, E. H. (1953). On the range of partial sums of a finite number of independent normal variates. Biometrika, 40, 35.




ON THE APPLICATION TO STATISTICS OF AN ELEMENTARY 
THEOREM IN PROBABILITY 


By H. A. DAVID 


Commonwealth Scientific and Industrial Research Organization, Sydney, 
Australia, and University of Melbourne 


1. INTRODUCTION 


Let $P_{ij\ldots}$ denote the joint probability of $r$ ($\le n$) events $A_i, A_j, \ldots$; and $S_r$ the sum of the $\binom{n}{r}$ $P$'s with $r$ different subscripts. Then the probability $p_{m,n}$ of the realization of at least $m$ events out of $n$ is given by (see, for example, Feller, 1950, p. 74)

$$p_{m,n} = \sum_{l=0}^{n-m} (-1)^l \binom{m+l-1}{m-1} S_{m+l}. \tag{1}$$

We shall be concerned with various statistical applications of this theorem. Let $y_1, y_2, \ldots, y_n$ be a set of random variables not necessarily independent or identically distributed, and $y_{(1)}, \ldots, y_{(n)}$ the same variates arranged in descending order of magnitude. We now write $P_{ij\ldots}(Y)$ for the joint probability of the $r$ events $y_i > Y$, $y_j > Y$, .... Accordingly, $p_{m,n}(Y)$ will be the probability that $y_{(m)}$ exceeds $Y$. In the special case when the joint distribution of the $y$'s is a symmetric function of the $y$'s we have

$$S_r = \binom{n}{r} P_{12\ldots r}(Y)$$

and (1) takes the form

$$p_{m,n}(Y) = \sum_{l=0}^{n-m} (-1)^l \binom{m+l-1}{m-1} \binom{n}{m+l} P_{12\ldots m+l}(Y). \tag{2}$$

In particular, for the important case $m = 1$ we have simply

$$p_{1,n}(Y) = \sum_{l=1}^{n} (-1)^{l-1} \binom{n}{l} P_{12\ldots l}(Y). \tag{3}$$

Equation (3) has in fact been used by Cochran (1941) to determine the distribution of the ratio of the largest to the sum of a group of sample variances all following a $\chi^2$ law with the same number of degrees of freedom. However, (3) and its generalizations are of much wider applicability as many important statistics are expressible as maxima. Examples are the maximum F-ratio (the largest of a set of F-ratios with a common denominator), the extreme deviate from the sample mean, and their studentized forms. The method is particularly convenient for the determination of upper percentage points, on the assumption of a normal parent population.* In a number of cases it permits also the evaluation of a power function, but the interpretation of this function needs a little care in the present context.

It may be mentioned here that since this paper was first submitted for publication an approach very close to ours has been employed by Halperin, Greenhouse, Cornfield & Zalokar (1955) to obtain upper percentage points for the 'studentized maximum absolute deviate in normal samples'. Some of the criteria listed above will now be considered in more detail.

* Similar remarks apply, of course, to lower percentage points of statistics expressible as minima (cf. Hartley, 1938).
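Theorem (1) is easy to check numerically. The sketch below assumes, purely for illustration, that the events are independent with given probabilities, so that each joint probability is a product; the theorem itself does not need independence. It compares the alternating sum in (1) with a brute-force enumeration of all outcomes. The function names are hypothetical.

```python
from itertools import combinations
from math import comb, prod


def at_least_m_by_theorem(probs, m):
    """p_{m,n} from (1), with S_r formed from products of independent event probabilities."""
    n = len(probs)

    def S(r):
        return sum(prod(subset) for subset in combinations(probs, r))

    return sum(
        (-1) ** l * comb(m + l - 1, m - 1) * S(m + l) for l in range(n - m + 1)
    )


def at_least_m_brute_force(probs, m):
    """Same probability by summing over all 2**n outcomes."""
    n = len(probs)
    total = 0.0
    for mask in range(2 ** n):
        hits = [(mask >> i) & 1 for i in range(n)]
        if sum(hits) >= m:
            total += prod(p if h else 1.0 - p for p, h in zip(probs, hits))
    return total


if __name__ == "__main__":
    probs = [0.1, 0.3, 0.5, 0.7]
    for m in range(1, len(probs) + 1):
        print(m, at_least_m_by_theorem(probs, m), at_least_m_brute_force(probs, m))
```

For $m = 1$ the first function reduces to the familiar inclusion-exclusion sum $S_1 - S_2 + S_3 - \ldots$, which is the form used in (3).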


2. THE EXTREME DEVIATE FROM THE SAMPLE MEAN

Let $x_t$ ($t = 1, \ldots, n$) be $n$ independent normal variates with unit standard deviation. Applying equation (3) to $y_t = x_t - \bar{x}$ we have

$$\Pr(x_{\max} - \bar{x} > Y) = \binom{n}{1} \Pr(x_1 - \bar{x} > Y) - \binom{n}{2} \Pr(x_1 - \bar{x} > Y,\ x_2 - \bar{x} > Y) + \ldots = T_1(Y) - T_2(Y) + \ldots + (-1)^{n-1} T_n(Y) \quad \text{(say)}. \tag{4}$$

Clearly, for reasonably large values of $Y$ a first approximation to the left-hand side is provided by $T_1$, with $T_2, T_3, \ldots$ as successive correction terms.* This first approximation, namely

$$\Pr(x_{\max} - \bar{x} > Y) \approx n \int_{Y[n/(n-1)]^{1/2}}^{\infty} (2\pi)^{-1/2} e^{-\frac{1}{2}u^2}\, du,$$

was suggested by McKay (1935), but its remarkable accuracy for the determination of the usual range of upper percentage points was established only when the exact values were tabulated for $n \le 25$ (Nair, 1948b; Grubbs, 1950). The position is easily appreciated from (4), in which $T_2$ may be expected to be small for large $Y$ since $x_1 - \bar{x}$, $x_2 - \bar{x}$ are negatively correlated. $T_2$ can, of course, be evaluated from tables of the volume under the normal bivariate surface. However, without doing this it is possible to obtain a simple gauge for the accuracy of $Y_1(\alpha)$, the first term approximation to the upper $100\alpha\%$ point $Y_\alpha$ of $x_{\max} - \bar{x}$. By Bonferroni's inequalities (Feller, 1950)

$$T_1 - T_2 \le \Pr(x_{\max} - \bar{x} > Y) \le T_1.$$

But for $Y = Y_1$ we have

$$T_1 = \alpha \tag{5}$$

and

$$T_2 < \tfrac{1}{2} n(n-1) [\Pr(x_1 - \bar{x} > Y_1)]^2 = \tfrac{1}{2}(n-1)\alpha^2/n,$$

so that

$$\alpha - \tfrac{1}{2}(n-1)\alpha^2/n < \Pr(x_{\max} - \bar{x} > Y_1) \le \alpha.$$

We have therefore that for all $n$ and $\alpha = 0.05$, $0.01$

$$0.04875 < \Pr(x_{\max} - \bar{x} > Y_1) \le 0.05, \qquad 0.00995 < \Pr(x_{\max} - \bar{x} > Y_1) \le 0.01.$$

A second approximation $Y_2(\alpha)$, which underestimates the true value, is given by solving for $Y_2$

$$T_1(Y_2) = \alpha + T_2(Y_2).$$

This is not very convenient, but we may replace $T_2(Y_2)$ by $\frac{1}{2}(n-1)\alpha^2/n$ to obtain a simple and generally very accurate second approximation.
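The first approximation and its accuracy gauge are simple to evaluate. The following sketch uses only the Python standard library; it solves $T_1(Y_1) = \alpha$ for $Y_1$ and returns the bounds derived above for the true tail probability at $Y_1$. The function names are illustrative.

```python
import math
from statistics import NormalDist


def first_approx_point(n, alpha):
    """Y_1(alpha): solves n * Pr(x_1 - xbar > Y) = alpha for unit-variance normal data."""
    z = NormalDist().inv_cdf(1.0 - alpha / n)   # upper alpha/n point of a unit normal
    return z * math.sqrt((n - 1) / n)           # x_1 - xbar has variance (n - 1)/n


def tail_probability_bounds(n, alpha):
    """Bounds on Pr(x_max - xbar > Y_1): (alpha - (n - 1) alpha^2 / (2n), alpha)."""
    return alpha - 0.5 * (n - 1) * alpha ** 2 / n, alpha


if __name__ == "__main__":
    for n in (3, 5, 10, 25):
        lo, hi = tail_probability_bounds(n, 0.05)
        print(n, round(first_approx_point(n, 0.05), 3), round(lo, 5), hi)
```

The lower bound never falls below $\alpha - \frac{1}{2}\alpha^2$, which is where the figures 0.04875 and 0.00995 in the text come from.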


* If (4) is applied to the ratio of $x_{\max} - \bar{x}$ to the standard deviation of the same sample of $n$, then, since this ratio is bounded, $T_1$ will give the left-hand side exactly for sufficiently large values of $Y$ (cf. Pearson & Chandra Sekar, 1936).

2.1. The power function of the extreme deviate test

In using $x_{\max} - \bar{x}$ to decide whether an outlying observation should be rejected one generally has in mind the following hypotheses:

Null-hypothesis $H_0$: The $x_t$ ($t = 1, \ldots, n$) are all drawn from a common normal population with unknown mean $\mu$ and known variance which may be taken as unity.

Alternative $H_1$: One or possibly a few of the $x_t$ come from a normal population with mean $\mu + \lambda$ and variance unity, the remaining $x_t$ belonging to the 'null' population.

There could also conceivably be more than one non-null population. It is possible to apply the test sequentially, that is, having found that the largest observation should be rejected one may proceed to test the second largest, and so on. Such procedures have been discussed by Hartley (1955),* but we shall confine ourselves to the case when a single test is carried out. If we take the event $A_t$ to be that $x_t - \bar{x} > Y_\alpha$, then equation (1) will give for $m = 1$ the probability of establishing significance. This will be considered in detail when $H_1$ specifies only one of the $x_t$, say $x_1$, as a true outlier. We have

$$P(\lambda) = \Pr(x_{\max} - \bar{x} > Y_\alpha \mid H_1) = (n-1) \Pr(u_1 > Y'_\alpha) + \Pr(u_2 > Y'_\alpha - \lambda') - (n-1) \Pr(u_1 > Y'_\alpha,\ u_2 > Y'_\alpha - \lambda') + \ldots, \tag{6}$$

where

$$Y'_\alpha = [n/(n-1)]^{1/2}\, Y_\alpha, \qquad \lambda' = [n/(n-1)]^{1/2}\, \lambda,$$

and $u_1$, $u_2$ are unit normal variates with correlation $\rho = -1/(n-1)$. From the discussion in the previous section it will be clear that the terms not shown in (6) are negligible if $P(\lambda)$ is required only to moderate accuracy. For $\alpha = 0.05$ Table 1 shows $P(\lambda)$ for selected values of $n \le 25$ and for $\lambda = 1, 2, 3, 4$. The last term of (6) was calculated with the aid of tables given by K. Pearson (1931) and Nicholson (1943).
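The three terms shown in (6) can be evaluated without bivariate normal tables. The sketch below is an illustration under two stated assumptions: $Y'_\alpha$ is taken from the first approximation $T_1 = \alpha$ rather than from exact percentage points, and the joint probability is obtained by Simpson's rule on a one-dimensional integral. It needs $n \ge 3$, since $\rho = -1$ when $n = 2$. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_N = NormalDist()


def joint_upper_tail(a, b, rho, steps=4000, width=10.0):
    """Pr(u1 > a, u2 > b) for unit normal variates with correlation rho (|rho| < 1)."""
    s = math.sqrt(1.0 - rho * rho)

    def g(x):
        # density of u1 at x times Pr(u2 > b | u1 = x)
        return _N.pdf(x) * (1.0 - _N.cdf((b - rho * x) / s))

    h = width / steps                      # steps must be even for Simpson's rule
    total = g(a) + g(a + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0


def power_largest_rejected(n, lam, alpha=0.05):
    """First three terms of (6) with Y'_alpha approximated by the upper alpha/n normal point."""
    y = _N.inv_cdf(1.0 - alpha / n)
    lam_p = lam * math.sqrt(n / (n - 1))
    rho = -1.0 / (n - 1)
    return ((n - 1) * (1.0 - _N.cdf(y))
            + (1.0 - _N.cdf(y - lam_p))
            - (n - 1) * joint_upper_tail(y, y - lam_p, rho))


if __name__ == "__main__":
    for n in (3, 10):
        print(n, [round(power_largest_rejected(n, lam), 3) for lam in (1, 2, 3, 4)])
```

Under these assumptions the values for $n = 3$ come out close to 0.216 and 0.654 for $\lambda = 1, 2$, and the value for $n = 10$, $\lambda = 3$ close to 0.740, which agrees with the legible entries of Table 1.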
Tabl 
el. Probability P(A) of rejecting the largest of n observations by the use of max, — 7 at the 
ons are normal with unit variance, n—1 have 


59 dose 3 
% level of significance when all observatt 


m 
ET wand one has mean pt A 
d 
^ 1 2 3 4 

3 0-216 0-654 0-951 0-999 

4 Opes 0-557 0-902 0-993 

5 a 0-496 0-862 0-987 

6 Dd 0-453 0-829 0-980 

| 7 0-127 0-420 0-802 oara 
| 5 0-120 0-395 0-778 oeng 
9 0-113 0:374 0-758 0.980) 

| 10 0-108 0:357 0-740 0-954 
12 0-710 0-943 

0-101 0-329 

15 0-093 0-300 0-074 0-929 

20 0-085 0-260 0-630 0-909 

25 0-080 0-244 0-598 0.39% 

Some care is needed in interpreting $P(\lambda)$. It is a power function as it gives the probability of rejecting $H_0$ when $H_1$ is in fact true, and since $P(0) = \alpha$. $P(\lambda)$ is therefore of interest as an indication of what is to be expected from the test when the experimenter is concerned with learning about the 'pollution' of his data by at least one rogue observation without wishing to identify this rogue. On the other hand, it is not equal to the probability of rightly inferring that there is a rogue observation and identifying it correctly, this being the joint probability that the largest observation is a true outlier and that it exceeds $Y_\alpha$. However, as $\lambda \to 0$ this joint probability tends to approximately $\alpha/n$ and is therefore not a power function, although possessing desirable properties. It is, moreover, difficult to evaluate but will always differ from $P(\lambda)$ by less than $\alpha$, approaching $P(\lambda)$ (or $\Pr(u_2 > Y'_\alpha - \lambda')$) as $\lambda$ increases.

* See also end of §3.1.


3. THE MAXIMUM P-RATIO 


Let $s_t^2/s^2$ $(t = 1, \ldots, n)$ denote the ratio of two sample variances, which follows an F-distribution
with $\nu_t$, $\nu$ degrees of freedom. Then the maximum F-ratio is $s_{\max}^2/s^2$. The importance of this
statistic in overcoming the effect of 'selecting' the largest among a number of F-ratios is
well known (see, for example, Pearson & Hartley, 1954, p. 39). The cases $\nu_t = 1$, $\nu_t = 2$
have been investigated in detail by Nair (1948a) and Finney (1941), respectively, but by
different methods from ours. Finney also indicates the solution for other even $\nu_t$. We shall
apply equation (3) to the problem.


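Take $y_t = s_t^2/s^2$.
Then
$$P_{ij\ldots}(Y) = \Pr(y_i > Y,\ y_j > Y,\ \ldots) = \int_0^\infty \Pr(\chi_i^2 > Y_i' x,\ \chi_j^2 > Y_j' x,\ \ldots)\, p(x)\,dx,$$
where $\chi_t^2$ is distributed as $\chi^2$ with $\nu_t$ degrees of freedom,
$$Y_t' = \nu_t Y/\nu, \qquad p(x) = 2^{-\nu/2}\{\Gamma(\tfrac12\nu)\}^{-1} x^{\nu/2-1} e^{-x/2} \quad (x > 0).$$
Writing $Q_t(x) = \Pr(\chi_t^2 > x)$ we have
$$P_{ij\ldots}(Y) = \int_0^\infty Q_i(Y_i' x)\, Q_j(Y_j' x) \cdots p(x)\,dx$$
and
$$P_{ij\ldots}(Y) = \int_0^\infty \prod_{t=i,j,\ldots} \Big[ e^{-Y_t' x/2} \sum_{r=0}^{\nu_t/2-1} \frac{(Y_t' x/2)^r}{r!} \Big]\, p(x)\,dx \tag{7}$$
if the $\nu_t$ are even. (7) may be evaluated by termwise integration when the integrand has
been multiplied out.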


3.1. Case of equal numerator degrees of freedom


We consider the following experimental designs of size $l \times l$: (i) the randomized block
experiment with equal numbers of treatments and blocks, (ii) the Latin square and (iii) the
Graeco-Latin square. Actually, the use of the maximum F-ratio is perhaps even more
important in factorial experiments in which a rather larger number of 'treatment' mean
squares is tested against a common error mean square. But the designs chosen will serve to
illustrate certain interesting points and have the convenience of allowing a fairly systematic
tabulation. It should also be noted that in the sequel all numerator mean squares are
treated on terms of equality, although it is often more appropriate to apply the method only
to those mean squares corresponding to factors of real interest and not to 'extraneous'
factors such as block effects.


Table 2 gives the probabilities, obtained from (2) and (7), with which the ordered F-ratios
$F_{(i)}$ $(i = 1, \ldots, n)$ exceed the upper $\alpha$ significance levels $F_\alpha$ of the corresponding random F-ratio. As is to
be expected, the term $\Pr(F_{(1)} > F_\alpha)$ accounts for by far the largest portion of the 'total
Type I error', viz. $2\alpha$, $3\alpha$, $4\alpha$, respectively. This term increases with $l$ for any given design.

The $\infty$-values shown are identical, for any $i$, with the probabilities obtained by assuming

Table 2. Showing the probability, for various designs of size $l \times l$, with which the ordered
F-ratio $F_{(i)}$ exceeds the upper $\alpha$ percentage point $F_\alpha$ of the corresponding random F-ratio

                       α = 0.05                               α = 0.01
  l    i    Randomized   Latin     Graeco-Latin    Randomized   Latin     Graeco-Latin
              block      square      square          block      square      square

  3    1     0.0842      0.0903        -             0.0172     0.0183        -
       2     0.0158      0.0424        -             0.0028     0.0084        -
       3       -         0.0172        -               -        0.0034        -

  5    1     0.0921      0.1238      0.1398          0.0191     0.0265      0.0306
       2     0.0079      0.0230      0.0434          0.0009     0.0031      0.0071
       3       -         0.0032      0.0137            -        0.0003      0.0019
       4       -           -         0.0031            -          -         0.0004

  7    1     0.0944      0.1321      0.1623          0.0195     0.0283      0.0359
       2     0.0056      0.0165      0.0315          0.0005     0.0016      0.0036
       3       -         0.0014      0.0058            -        0.0001      0.0004
       4       -           -         0.0006            -          -         0.0000

  9    1     0.0953      0.1356      0.1706          0.0197     0.0288      0.0374
       2     0.0047      0.0136      0.0259          0.0003     0.0011      0.0024
       3       -         0.0008      0.0032            -        0.0000      0.0002
       4       -           -         0.0002            -          -         0.0000

  ∞    1     0.0975      0.1426      0.1855          0.0199     0.0297      0.0394
       2     0.0025      0.0072      0.0140          0.0001     0.0003      0.0006
       3       -         0.0001      0.0005            -        0.0000      0.0000
       4       -           -         0.0000            -          -         0.0000


independence among the variance ratios (Hartley, 1938) and apply to any $n$-factor design
($n = 2, 3, 4$) with large error degrees of freedom.
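The limiting ($l = \infty$) entries of Table 2 can be checked with a few lines of arithmetic. The sketch below assumes, as the text states for large error degrees of freedom, that the $n$ variance ratios are independent, so that the number exceeding $F_\alpha$ is binomial with index $n$ and probability $\alpha$. The function names are illustrative and not taken from the paper.

```python
from math import comb


def prob_ith_largest_exceeds(i, n, alpha):
    """P(the i-th largest of n independent F-ratios exceeds F_alpha).

    Under independence this is the probability that at least i of the n
    ratios exceed their upper-alpha point, i.e. a binomial upper tail.
    """
    return sum(comb(n, k) * alpha**k * (1 - alpha) ** (n - k) for k in range(i, n + 1))


def limiting_column(n, alpha):
    """All n entries (i = 1..n) of one limiting column of the table."""
    return [prob_ith_largest_exceeds(i, n, alpha) for i in range(1, n + 1)]


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        for n in (2, 3, 4):  # randomized block, Latin square, Graeco-Latin square
            column = limiting_column(n, alpha)
            print(alpha, n, [f"{p:.4f}" for p in column], "sum =", f"{sum(column):.4f}")
```

Each column sums to $n\alpha$, which is the 'total' of $2\alpha$, $3\alpha$, $4\alpha$ referred to in the text: the sum over $i$ of the probabilities that at least $i$ ratios exceed $F_\alpha$ is the expected number of exceedances.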
In Table 3 we give the upper 5 and 1 % points of the maximum F-ratio in the above cases
and also for $l = 6, 8$. Below each exact figure is shown the $100\alpha/n$ % point of the
corresponding random F-ratio, this being a simple conservative approximation due to Hartley
(1949 [...]). The exact values for $l = 6, 8$ were in fact inserted by interpolation with the
approximation as an auxiliary function, and may be in error by a few units in the second
decimal place. If the error degrees of freedom $\nu$ are noted it will be found, as might be
expected, that Hartley's approximation increases in accuracy as $\nu$ increases ($n$ constant),
and as $n$ decreases ($\nu$ constant). The usefulness of the approximation, except for very small $\nu$,
is obvious for the cases treated in Table 3. Moreover, in many other situations (including
the case of 'extraneous' factors mentioned above) the demands made on the approximation
will be less severe.

If the approximation is accepted a very simple procedure permits the complete analysis 
of the experiment (Hartley, 1955): 

Test $F_{(1)}$ by referring to the $100\alpha/n$ % level of the corresponding F-ratio. If $F_{(1)}$ is not
significant, stop; if significant, test $F_{(2)}$ at the $100\alpha/(n-1)$ % level, etc.
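The sequential rule just quoted can be sketched as follows. This is only an illustration under stated assumptions: the inputs are taken to be the upper-tail probabilities of the observed F-ratios (so that, with equal numerator degrees of freedom, the largest F-ratio has the smallest tail probability), and the function name and return convention are hypothetical rather than Hartley's.

```python
def step_down_max_f(tail_probs, alpha=0.05):
    """Apply the step-down rule to n F-ratios given their upper-tail probabilities.

    The largest ratio is tested at level alpha/n; if it is significant the
    next largest is tested at alpha/(n-1), and so on. Testing stops at the
    first non-significant ratio. Returns the indices declared significant,
    in the order in which they were tested.
    """
    n = len(tail_probs)
    # Smallest tail probability first, i.e. largest F-ratio first.
    order = sorted(range(n), key=lambda t: tail_probs[t])
    significant = []
    for step, t in enumerate(order):
        if tail_probs[t] <= alpha / (n - step):
            significant.append(t)
        else:
            break  # stop at the first non-significant ratio
    return significant


if __name__ == "__main__":
    # Three factors: the first and third are declared significant at the 5% level.
    print(step_down_max_f([0.001, 0.30, 0.02], alpha=0.05))
```

In practice the tail probabilities would be replaced by a comparison of each F-ratio with the tabulated percentage point of F at the stated level; the logic of the stopping rule is the same.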


Table 3. Upper $100\alpha$ percentage points of the maximum F-ratio followed by the
corresponding $100\alpha/n$ % points of F for the $n$-factor designs of Table 2

                      α = 0.05                                α = 0.01
  l        n = 2         n = 3         n = 4        n = 2     n = 3     n = 4

  3        9.67          35.4           -           24.3      182.0       -
          10.65          59.0           -           26.3      299.0       -

  5        3.66          [illegible]   5.86          5.59      7.05     10.2
           3.73          4.67          6.49          5.64      7.23     10.9

  6        [illegible]   3.55         (4.09)*        4.40      5.10     (6.10)*
           3.13          3.64         (4.31)*        4.43      5.17     (6.26)*

  7        2.76          3.08          3.41          3.78      4.21      4.71
           2.78          3.13          3.51          3.79      4.24      4.76

  8        [illegible]   2.79          3.01          3.39      3.68      4.00
           [illegible]   2.82          3.07          3.39      3.70      4.03

  9        2.38          2.60          2.76          3.10      3.34      3.56
           2.40          2.62          2.80          3.10      3.35      3.58

* These values must, of course, be interpreted without reference to an experimental design as no
6 × 6 Graeco-Latin square exists.
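For the $3 \times 3$ designs the approximate entries ($100\alpha/n$ % points of F) can be verified without tables, because an F-ratio with 2 numerator degrees of freedom has the closed-form upper-tail probability $(1 + 2F/\nu)^{-\nu/2}$. The sketch below inverts that expression. It assumes the usual error degrees of freedom, $(l-1)^2 = 4$ for the randomized block ($n = 2$) and $(l-1)(l-2) = 2$ for the Latin square ($n = 3$); the function name is illustrative.

```python
def f_upper_point_2df(p, nu):
    """Upper 100p% point of F with 2 and nu degrees of freedom.

    Solves (1 + 2F/nu) ** (-nu/2) = p for F, which is exact when the
    numerator has exactly 2 degrees of freedom.
    """
    return (nu / 2.0) * (p ** (-2.0 / nu) - 1.0)


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        block = f_upper_point_2df(alpha / 2, nu=4)   # 3 x 3 randomized block, n = 2
        latin = f_upper_point_2df(alpha / 3, nu=2)   # 3 x 3 Latin square, n = 3
        print(f"alpha = {alpha}: randomized block {block:.2f}, Latin square {latin:.1f}")
```

The same check is not available in closed form for the larger designs, whose numerator degrees of freedom exceed 2.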


3.2. Generalization to unequal numerator degrees of freedom


The $F_{\max}$ test is intuitively at its best when all numerator degrees of freedom $\nu_t$ are equal.
In the more general case it is still possible to determine exact significance levels of $F_{\max}$ by
means of (7) provided all the $\nu_t$ are even. However, the probability with which the $t$th factor
is significant by chance is then no longer independent of $t$ but is larger for the smaller $\nu_t$. This
difficulty may be overcome by having, for any given design, a specific significance level
$F_t(\alpha)$, say, for the $t$th F-ratio $F_t$, subject to the following conditions:

(i) probability that $F_t$ exceeds $F_t(\alpha)$ is independent of $t$;
(ii) probability that at least one $F_t$ exceeds $F_t(\alpha)$ is $\alpha$.

These conditions result in a set of equations which may be written down by means of
(2) and (7) and solved numerically.


H. A. DAVID 91 


Example. In the case of a $\nu_1 + 1$ by $\nu_2 + 1$ randomized block experiment $F_1(\alpha)$, $F_2(\alpha)$ are
the solutions of

(i) $\Pr(s_1^2/s^2 > F_1) = \Pr(s_2^2/s^2 > F_2)$,

(ii) $2\Pr(s_1^2/s^2 > F_1) - \Pr(s_1^2/s^2 > F_1,\ s_2^2/s^2 > F_2) = \alpha$,

where $s_1^2$, $s_2^2$ and $s^2$ are mean squares based on $\nu_1$, $\nu_2$ and $\nu_1\nu_2$ degrees of freedom, respectively.
For $\nu_1 = 8$, $\nu_2 = 2$ and $\alpha = 0.05$ we obtain $F_1(0.05) = 3.07$, $F_2(0.05) = 4.58$. These values
may be compared with the 2½ % significance levels of the corresponding F-ratios, 3.12, 4.69.
Adequate tabulation would, of course, be feasible only in certain simple cases. But in
most practical situations it will be sufficient to test the largest variance ratio against the
$100\alpha/n$ % point of the appropriate F-ratio. When special accuracy is desired the
approximate values will help in the numerical solution for the somewhat smaller exact
percentage points. Reference should again be made to Hartley (1955).


I am indebted to the referee for a number of valuable comments on the first draft of this
paper and to Miss B. C. Halliburton of the C.S.I.R.O. for her careful computation of the
various tables.
REFERENCES

Cochran, W. G. (1941). The distribution of the largest of a set of estimated variances as a fraction of their total. Ann. Eugen., Lond., 11, 47.
Feller, W. (1950). An Introduction to Probability Theory and its Applications. New York: John Wiley and Sons.
Finney, D. J. (1941). The joint distribution of variance ratios based on a common error mean square. Ann. Eugen., Lond., 11, 136.
Fisher, R. A. (1929). Tests of significance in harmonic analysis. Proc. Roy. Soc. A, 125, 54.
Grubbs, F. E. (1950). Sample criteria for testing outlying observations. Ann. Math. Statist. 21, 27.
Halperin, M., Greenhouse, S. W., Cornfield, J. & Zalokar, Julia (1955). Tables of percentage points for the studentized maximum absolute deviate in normal samples. J. Amer. Statist. Ass. 50, 185.
Hartley, H. O. (1938). Studentization and large-sample theory. J. R. Statist. Soc. Suppl. 5, 80.
Hartley, H. O. (1949). Tests of significance in harmonic analysis. Biometrika, 36, 194.
Hartley, H. O. (1955). Some recent developments in analysis of variance. Comm. Pure and Appl. Math. 8, 47.
McKay, A. T. (1935). The distribution of the difference between the extreme observation and the sample mean in samples of n from a normal universe. Biometrika, 27, 466.
Nair, K. R. (1948a). The studentized form of the extreme mean square test in the analysis of variance. Biometrika, 35, 16.
Nair, K. R. (1948b). The distribution of the extreme deviate from the sample mean and its studentized form. Biometrika, 35, 118.
Nicholson, C. (1943). The probability integral for two variables. Biometrika, 33, 59.
Pearson, E. S. & Chandra Sekar, C. (1936). The efficiency of statistical tools and a criterion for the rejection of outlying observations. Biometrika, 28, 308.
Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University Press.
Pearson, K. (1931). Tables for Statisticians and Biometricians, 2. Cambridge University Press.


[ 92] 


χ² PROBABILITIES FOR LARGE NUMBERS OF DEGREES
OF FREEDOM


By JOHN WISHART 
Statistical Laboratory, University of Cambridge 


Probability values for $\chi^2$ are given in Pearson & Hartley (1954, Table 7) for $\chi^2$ up to 120 and
for degrees of freedom ($\nu$) not greater than 70. This means that the complete range is not
covered beyond about $\chi^2 = 40$, and no values are available for really high $\chi^2$. Casting about
for a suitable expansion which would make tabulation in this range feasible, the author
first had a look at Karl Pearson's modal expansion in terms of incomplete normal moment
functions (Pearson, 1922, p. xix), a formula which may alternatively be written in terms of
certain $\chi^2$ probabilities, or as a polynomial in a related variate $z'$, combined with chosen
values of the normal probability integral and ordinate corresponding to $z'$ ($P(X)$ and $Z(X)$
in the notation of Pearson & Hartley). Karl Pearson calls attention to the slowness of
convergence of his formula, and evidently did not recommend it highly for use beyond the
tabulated range of the Incomplete Gamma-function Table.

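Going back to first principles, we shall find it convenient to adopt the Pearson-Hartley
notation of using $c$ for $\tfrac12\nu$. The frequency function of $y = \chi^2/\nu = \tfrac12\chi^2/c$ is
$$\frac{c^c}{\Gamma(c)}\, y^{c-1} e^{-cy} \quad (0 < y < \infty),$$
from which that of $z = \tfrac12 \ln y$ is
$$\frac{2c^c}{\Gamma(c)} \exp\{2c(z - \tfrac12 e^{2z})\} = \frac{2c^c e^{-c}}{\Gamma(c)} \exp\Big[-c\Big\{\frac{(2z)^2}{2!} + \frac{(2z)^3}{3!} + \ldots\Big\}\Big] \quad (-\infty < z < \infty). \tag{1}$$
Now write $x = 2z\sqrt{c} = \sqrt{(\tfrac12\nu)}\,\ln(\chi^2/\nu)$. This leads to the frequency function for $x$
$$f(x) = \frac{\sqrt{(2\pi)}\, c^{c-\frac12} e^{-c}}{\Gamma(c)} \cdot \frac{e^{-\frac12 x^2}}{\sqrt{(2\pi)}} \exp\Big[-\Big(\frac{x^3}{3!\,c^{1/2}} + \frac{x^4}{4!\,c} + \frac{x^5}{5!\,c^{3/2}} + \frac{x^6}{6!\,c^2} + \ldots\Big)\Big]$$
$$= a\,\frac{e^{-\frac12 x^2}}{\sqrt{(2\pi)}} \Big[1 - \frac{x^3}{6c^{1/2}} + \frac{x^6 - 3x^4}{72c} - \frac{5x^9 - 45x^7 + 54x^5}{6480c^{3/2}} + \frac{5x^{12} - 90x^{10} + 351x^8 - 216x^6}{155{,}520c^2} - \ldots\Big] \tag{2}$$
expanding as far as terms in $c^{-2}$, to this point the value of $a$ being
$$1 - \frac{1}{12c} + \frac{1}{288c^2}.$$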
The range of $x$ is from $-\infty$ to $+\infty$, and $x = 0$, while not the mean of $x$, corresponds to
$\chi^2 = \nu$, i.e. to the mean of $\chi^2$. Effectively we are transforming $\chi^2$ by replacing the ratio of
$\chi^2$ to its mean, raised to the power of its standard deviation, by $e^{2x}$.
To obtain $\chi^2$ probabilities we require to integrate $f(x)\,dx$ from $-\infty$ to $X$, where $X$ may be
positive or negative, or from $X$ to $\infty$. The answer can be given in more than one form. It
may be put in terms of the incomplete normal moment functions
$$m_n(X) = \frac{1}{\sqrt{(2\pi)}} \int_0^X x^n e^{-\frac12 x^2}\,dx \Big/ \{(n-1)(n-3)\cdots 1 \ \text{ or } \ (n-1)(n-3)\cdots 2\},$$
according as $n$ is even or odd, for comparison with Karl Pearson's modal expansion. We
then get
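Formula A. $X = \sqrt{c}\,\ln(\chi^2/2c)$:
$$\int_0^X f(x)\,dx = \Big(1 - \frac{1}{12c} + \frac{1}{288c^2}\Big) \Big[ m_0(X) - \frac{1}{3c^{1/2}}\, m_3(X) + \frac{1}{24c}\{5m_6(X) - 3m_4(X)\}$$
$$- \frac{1}{135c^{3/2}}\{40m_9(X) - 45m_7(X) + 9m_5(X)\} + \frac{1}{1152c^2}\{385m_{12}(X) - 630m_{10}(X) + 273m_8(X) - 24m_6(X)\} \Big] \tag{3}$$
as far as terms in $c^{-2}$. Values of $m_1$ to $m_{12}$ are given to seven decimals in the old Tables
(Pearson, 1914, 1931), while $m_0(X) = P(X) - 0.5$ and $P(X)$ is readily available, either in the
old Tables or in the new Biometrika Tables.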
From this formula we can derive a very useful expression for the chance of $\chi^2$ being less
than, or at most equal to, its degrees of freedom $\nu$. For this will be $\int_{-\infty}^0 f(x)\,dx$, obtained from
Formula A by putting $X = -\infty$ and noting that
$$m_{2r}(-\infty) = -\tfrac12 \quad\text{and}\quad m_{2r+1}(-\infty) = (2\pi)^{-\frac12}.$$
Formula A'.
$$\int_{-\infty}^0 f(x)\,dx = \Big(1 - \frac{1}{12c} + \frac{1}{288c^2}\Big) \Big[\frac12 + \frac{1}{\sqrt{(2\pi)}}\frac{1}{3c^{1/2}} + \frac{1}{24c} + \frac{1}{\sqrt{(2\pi)}}\frac{4}{135c^{3/2}} + \frac{1}{576c^2}\Big]$$
$$= \frac12 + \frac{1}{3\sqrt{(2\pi c)}}\Big(1 + \frac{1}{180c}\Big).^{*} \tag{4}$$
This formula will give seven-decimal accuracy for $c$ of the order of 50 or above, i.e. beyond
the range of the Incomplete Gamma-function Tables (the $p$ of that table is our $c - 1$).
Adding or subtracting (3) and (4) gives the required $\chi^2$ probabilities.

Alternatively, we may write in (3)
$$\sqrt{(2\pi)}\, m_{2r+1}(X) = P(X^2 \mid 2r+2), \qquad 2m_{2r}(X) = P(X^2 \mid 2r+1),$$
where $P(X^2 \mid \nu)$ is the probability that $\chi^2$ does not exceed $X^2$, for $\nu$ degrees of freedom.
These probabilities may be obtained to five decimals by subtracting from unity the $\chi^2$
probabilities given in the Biometrika Tables, Table 7.

By integrating the terms of (2) by parts from $-\infty$ to $X$, we may get the probability
that $x$ does not exceed $X$ in the form of an expansion in powers of $X$, involving
$$P(X) = \frac{1}{\sqrt{(2\pi)}} \int_{-\infty}^X e^{-\frac12 x^2}\,dx, \qquad Z(X) = \frac{1}{\sqrt{(2\pi)}}\, e^{-\frac12 X^2}.$$
Multiplying in by the outside factor we get our result in the form of


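Formula B. $X = \sqrt{c}\,\ln(\chi^2/(2c))$:
$$\int_{-\infty}^X f(x)\,dx = P(X) + Z(X)\Big[\frac{X^2 + 2}{6c^{1/2}} - \frac{X^5 + 2X^3 + 6X}{72c} + \frac{5X^8 - 5X^6 + 24X^4 + 6X^2 + 12}{6480c^{3/2}}$$
$$- \frac{5X^{11} - 35X^9 + 36X^7 - 144X^5 - 180X^3 - 540X}{155{,}520c^2}\Big]. \tag{5}$$

* An additional term (inside the bracket) is $-25/(2016c^2)$.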



Now the well-known Cornish-Fisher expansion (Cornish & Fisher, 1937) applies to the
general $z$ function. We may adapt this to our special case by writing $n_1 = \nu = 2c$, $n_2 = \infty$.
In the formulae for $a, b, \ldots, f$ on p. 30.13 of the reference, we have $\delta = \sigma = (2c)^{-1}$, and if we
then substitute in the formulae on p. 30.9 we get Formula B above, Cornish & Fisher's $p$,
$z$ and $\xi$ being replaced by $P(X)$, $Z(X)$ and $X$ respectively. Thus the expansion we have
obtained is not a new one, but is a special case of a direct Cornish-Fisher expansion in which
$n_2$ is allowed to go to infinity. Our derivation of this expansion from the frequency function
of $\chi^2$ by a few lines of algebra is, however, believed to be new.

The best-known form of the Cornish-Fisher expansion is the inverse one, namely, a formula
which determines $z_1$ in terms of $X$, where the chance that $z$ is less than, or at most equal to,
$z_1$ is equal to $P(X)$. A chosen value of $P(X)$ determines $X$ from normal probability tables.
We can therefore get the inverse form to Formula B by putting $\delta = \sigma = (2c)^{-1}$ in the
formulae on p. 30.14 of the reference cited, but note that in the penultimate line $\delta^{[\ldots]}$ should
read $\delta^{[\ldots]}$. With $z = \tfrac12 \ln(\chi^2/\nu)$ we get

Formula B'.
$$z = \frac{X}{2c^{1/2}} - \frac{X^2 + 2}{12c} + \frac{X^3 + 5X}{72c^{3/2}} - \frac{6X^4 + 59X^2 + 58}{3240c^2} + \frac{9X^5 + 232X^3 + 599X}{77{,}760c^{5/2}}. \tag{6}$$


Corresponding to a chosen value of the probability that $\chi^2$ be not exceeded, we then obtain
the corresponding $\chi^2$ as $\nu e^{2z}$ from tables of natural logarithms.

A footnote to Table 8 in the Biometrika Tables (Pearson & Hartley, 1954) gives two
approximations for $\nu > 100$. One which is practically as good as the Wilson-Hilferty cube-
root approximation, and better than Fisher's $\sqrt{(2\chi^2)}$, can be obtained from the first two
terms of Formula B' by putting $\ln(\chi^2/\nu) = X\sqrt{(2/\nu)} - (X^2 + 2)/(3\nu)$. Equally, a fair approxi-
mation to the probabilities beyond the range of Table 7 may be got from the first two
terms of (5).
Examples.

(1) $\nu = 98$, $c = 49$. Let $\chi^2 = 98$. Use Formula A'.
    $P(\chi^2 \le 98) = \frac12 + \frac{1}{21\sqrt{(2\pi)}}\Big(1 + \frac{1}{8820}\Big) = 0.5189994.$
From the Incomplete Gamma-function Tables
    $u = 7$, $p = 48$, $I(u, p) = 0.5189993$.

(2) $\nu = 100$, $c = 50$. Let $\chi^2 = 124.342$. Use Formula B.
    $z = \tfrac12 \ln 1.24342 = 0.108933$, $X = 2z\sqrt{c} = 1.5405453$,
    $P(\chi^2 \le 124.342) = 0.93828625$
                              $+\,0.01255249$
                              $-\,0.00085353$
                              $+\,0.00001346$
                              $+\,0.00000098$
                            $= 0.9499996$.
Biometrika Tables (Table 8) give 0.05 point for $\chi^2$ as 124.342.

(3) $\nu = 100$, $c = 50$. Let $Q(\chi^2) = P(\chi^2 > \chi_0^2) = 0.05$. Use Formula B'.
    $X = 1.64485363$,
    $z = 0.11630872$
        $-\,0.00784251$
        $+\,0.00049790$
        $-\,0.00003229$
        $+\,0.00000155$
      $= 0.1089333$, giving $\chi_0^2 = 124.3421$.

(4) $\nu = 100$, $c = 50$. Let $\chi^2 = 77.9295$. Use Formula B.
    $z = -0.1246828$, $X = -1.7632811$,
    $P(\chi^2 \le 77.9295) = [\ldots] + [\ldots] + [\ldots] + 0.00002132 + [\ldots] = 0.0500004$.
Biometrika Tables (Table 8) give 0.95 point for $\chi^2$ as 77.9295.

(5) $\nu = 100$, $c = 50$. Let $P(\chi^2) = 0.05$. Use Formula B'.
    $X = -1.64485363$,
    $z = -0.11630872$
        $-\,0.00784251$
        $-\,0.00049790$
        $-\,0.00003229$
        $-\,0.00000155$
      $= -0.1246830$, giving $\chi^2 = 77.9295$.
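The worked examples can be reproduced mechanically. The following sketch codes Formula B and Formula B' as polynomial expansions in $X$ and checks them against Examples (1) to (5). The coefficients used are those that reproduce the printed example figures (in particular $-540X$ in the last term of Formula B and $232X^3$ in the last term of Formula B'); the function names are illustrative, and the normal integral and its inverse are taken from the Python standard library rather than from tables.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal: supplies P(X), Z(X) and the inverse of P


def chi2_cdf_formula_b(chi2, nu):
    """P(chi-square with nu d.f. <= chi2) by Formula B, terms up to c**-2."""
    c = nu / 2.0
    root_c = math.sqrt(c)
    X = root_c * math.log(chi2 / nu)  # X = sqrt(c) * ln(chi2 / 2c)
    bracket = (
        (X**2 + 2) / (6 * root_c)
        - (X**5 + 2 * X**3 + 6 * X) / (72 * c)
        + (5 * X**8 - 5 * X**6 + 24 * X**4 + 6 * X**2 + 12) / (6480 * c * root_c)
        - (5 * X**11 - 35 * X**9 + 36 * X**7 - 144 * X**5 - 180 * X**3 - 540 * X)
        / (155520 * c**2)
    )
    return _NORMAL.cdf(X) + _NORMAL.pdf(X) * bracket


def chi2_quantile_formula_b_prime(p, nu):
    """Value of chi-square not exceeded with probability p, by Formula B'."""
    c = nu / 2.0
    root_c = math.sqrt(c)
    X = _NORMAL.inv_cdf(p)  # normal deviate with P(X) = p
    z = (
        X / (2 * root_c)
        - (X**2 + 2) / (12 * c)
        + (X**3 + 5 * X) / (72 * c * root_c)
        - (6 * X**4 + 59 * X**2 + 58) / (3240 * c**2)
        + (9 * X**5 + 232 * X**3 + 599 * X) / (77760 * c**2 * root_c)
    )
    return nu * math.exp(2 * z)  # chi2 = nu * exp(2z)


if __name__ == "__main__":
    print(f"{chi2_cdf_formula_b(98, 98):.7f}")        # Example (1), via X = 0
    print(f"{chi2_cdf_formula_b(124.342, 100):.7f}")  # Example (2)
    print(f"{chi2_quantile_formula_b_prime(0.95, 100):.4f}")  # Example (3)
    print(f"{chi2_cdf_formula_b(77.9295, 100):.7f}")  # Example (4)
    print(f"{chi2_quantile_formula_b_prime(0.05, 100):.4f}")  # Example (5)
```

Putting $\chi^2 = \nu$ in Formula B gives $X = 0$, and the bracket then reduces to the two terms of Formula A', which is why Example (1) can be checked with the same function.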
The urge to undertake this investigation might have been lacking had Slutskii's Tables
(1950) been available earlier to the author. For Slutskii tabulates the $\chi^2$ probabilities
according to $t = \sqrt{(2\chi^2)} - \sqrt{(2\nu)}$, first for $\nu = 6$ to 32 (Table III) and secondly for
$\sqrt{(2/\nu)} = 0$ to 0.25, i.e. for $\nu = \infty$ down to 32 (Table IV). The tables are given to five decimal
places, and Table IV, in particular, is a very useful supplement to Pearson & Hartley's
Table 7. Formulae A and B above may, nevertheless, be useful for tabulating $\chi^2 \mid \nu$ pro-
babilities to considerable accuracy, or in abbreviated form, for obtaining approximate
values without the use of tables. They are also interesting in their own right. They are
special cases of a general expansion which the author has since worked out, and which will
be given in a separate paper.
REFERENCES

Cornish, E. A. & Fisher, R. A. (1937). Rev. Inst. int. Statist. [...] Contributions to Mathematical Statistics, by R. A. Fisher, Paper 30. 1950. London [...].
Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University Press.
Pearson, K. (1914, 1931). Tables for Statisticians and Biometricians. Part I (Table IX), Part II [...].
Pearson, K. (1922). Tables of the Incomplete Γ-function. Biometrika Office. Reissued 1934, 1946. Cambridge University Press.
Slutskii, E. E. (1950). Tables for the Calculation of the Incomplete Γ-function and the χ² Probability Function. Moscow and Leningrad: Izdatelstvo Akademii Nauk SSSR.


[ 96 ] 


THE SAMPLING DISTRIBUTION OF A 
MAXIMUM-LIKELIHOOD ESTIMATE 


By J. B. S. HALDANE AND SHEILA MAYNARD SMITH


University College London 


The method of maximum likelihood was used by Edgeworth (1908) and others. Since 
Fisher’s (1921) paper, which gave a general expression for the variance of an estimate by it, 
it has been very widely used. However, no serious attempt to determine the form of its 
sampling distribution when the number in the sample is not very large was made in the 
next thirty years, though Hotelling (1930) and others improved the rigour of earlier proofs. 
Haldane (1953a) gave its bias, but gave an incorrect expression for its third moment. These
were, however, obtained as special cases of a more general theorem. The paper is, moreover,
hard to follow owing to numerous misprints, and several errors, some, at least, of which are
here corrected. In this paper we obtain the first four moments of the distribution by a more 
direct approach. 

Throughout we consider the estimation of a single unknown parameter &. Let x be the 


maximum-likelihood estimate of &, based on a sample of N members. Before considering 
samples from a continuous distribution, we supp 


classified into a finite or denumerable set of class 
sample falling into the rth class is n 


The permissible range of values of 
over this range, which may consist of several discrete p 


arts, every f(£) is regular. I" 
particular, we assume that, even at the boundaries of this i 


range, £ remains finite. 


N; 
£e) = DE, fe) = i, then w= 
2 


Hence &(a) is infinite, since Ny has a finite probability of being zero. In such estimations th® 


maximum-likelihood estimate is seriously biased even if cases where one class is empty 97€ 
excluded. Thus in the above example z still has a bias TNGÉU +&)+0(N-2) when case? 
with n, = 0 are omitted, whilst z' = n,/ (n5 +1) has a bias tending to zero more rapidly tha? 
any negative power of N - This criticism applies to some other efficient estimators, for 


example, the ‘product method’ which Fisher (1954) recommends for linkage estimation- 
TE 


16 = (^) &a- ar, 
r| (mN), and its sampling distribution is binomial, with cumulants 


K=€ k= NE - £), Ks = N-E(1 -&uü — 2£), etc. 
In this case, the maximum-likeliho 
a Poisson distri bution, or for a dist 
We conjecture that it is biased in 


then z = Y rn, 
T 


T 
od estimate is absolutely unbiased, The same is true a 
ribution in which h-+ ké is substituted for £ in f,(8) abo pe 
all other cases except for special values of £ It Lin 


p 




ro. that in the important case where expectations are linear functions of £, that is to say 
(E) = h, + c£, the bias is of order N~, and thus usually negligible, whereas in other ties 


it tends to zero with N-1. 


^ SYMBOLISM AND PRELIMINARY LEMMATA 
Let " " m i 
JE = an FE =bn FE =en f"B-d. JEE) = e 


Since Xf) =1 for all u, it follows that Da, = 1, X5, = Xie, etc. = 0. Clearly a, b, etc 
f T * dio a 


are constants over any series of samples from a given population, though their values may 
have to be estimated. " 
Let x = £ 4- y, so that y is the error of the estimate. 


Let n, = N(a,--z,). 


Te HUM m " 
t A, = Xa. B; = Dar "bre, €, = Xia; ie, 
' r r 
D; = Xa; bid, S; = Z hia, where h, is arbitrary. 
y 


r 

L Apat -ipi -ipi- 

et a; ya; Os "25 Pi = Xa. ibi C Ep, ô; = Xa. ibi 1d,2,, 
r r 


Li 
9 = auf À= 2n 99s 


aning is clear. Thus Ya-bz means Y; a7 15,2, and so on. 
T 


Suffixes will be dropped when the me 


It is at once obvious from Taylor's theorem that 


f(x) = a b,y + 3e, 9? + idy? + 7] i 
Fæ) = be y + Ma + de T q 
. The symbolism is that of Haldane (19534). The quantities 4;,etc., are connected by 
Simple relationships such as 
d i ; d 
gA = —iAgua t (0+ 1) Bj. d£ 
th 4, and C; are positive wh 
E is to say, for values of 
Te usually large under the sà 


B, = -iBi iC Dy. 


omes large when any a, is small, or any b, large, 
£ for which any FAS) is small, or is changing rapidly. B; and D; 
me conditions, and when a first or second derivative of any 
h a term as Ay 34,, which occurs in the expression for the 


en i is odd. A; bec 


J(8) is changi i 
r han, dly. Suc 
ging rapi y if any f,(£) is small, and so on. 


f x, is usually large 1 


Sk : B . 
€wness of the distribution 0 
s for such expectation 


In order to obtain expressions for such expectations as $\mathscr{E}(\alpha_1^3)$ and $\mathscr{E}(\alpha_1^2\theta)$ we require
expressions for the expectations of powers and products such as $(\Sigma h_r z_r)^i$ and
$(\Sigma h_r z_r)^i\,\Sigma k_r z_r$, where $h_r$ and $k_r$ are constants. First consider the sampling distribution of
$\Sigma h_r z_r$. Then if $S_i = \Sigma h_r^i a_r$, it is easy to show, from the moments of the multinomial dis-
tribution, that the expectation of $\Sigma h_r z_r$ is zero, and its first few moments are:
$$\mu_2 = N^{-1}(S_2 - S_1^2),$$
$$\mu_3 = N^{-2}(S_3 - 3S_1S_2 + 2S_1^3),$$
$$\mu_4 = 3N^{-2}(S_2 - S_1^2)^2 + N^{-3}(S_4 - 4S_1S_3 - 3S_2^2 + 12S_1^2S_2 - 6S_1^4), \tag{2}$$
$$\mu_5 = 10N^{-3}(S_2 - S_1^2)(S_3 - 3S_1S_2 + 2S_1^3) + O(N^{-4}),$$
$$\mu_6 = 15N^{-3}(S_2 - S_1^2)^3 + O(N^{-4}).$$
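For example,
$$\mathscr{E}(z_r^3) = N^{-2}a_r(1 - a_r)(1 - 2a_r),$$
$$\mathscr{E}(z_r^2 z_s) = -N^{-2}a_r a_s(1 - 2a_r),$$
$$\mathscr{E}(z_r z_s z_t) = 2N^{-2}a_r a_s a_t.$$
$$(\Sigma h_r z_r)^3 = \Sigma h_r^3 z_r^3 + 3\Sigma h_r^2 h_s z_r^2 z_s + 6\Sigma h_r h_s h_t z_r z_s z_t.$$
So
$$\mathscr{E}[(\Sigma h_r z_r)^3] = N^{-2}[\Sigma h_r^3(a_r - 3a_r^2 + 2a_r^3) + 3\Sigma h_r^2 h_s(-a_r a_s + 2a_r^2 a_s) + 12\Sigma h_r h_s h_t a_r a_s a_t]$$
$$= N^{-2}[\Sigma h_r^3 a_r - 3\Sigma h_r a_r\,\Sigma h_r^2 a_r + 2(\Sigma h_r a_r)^3]$$
$$= N^{-2}(S_3 - 3S_1S_2 + 2S_1^3).$$
The leading terms of $\mu_4$, $\mu_5$ and $\mu_6$ can be derived directly from $\mu_2$ and $\mu_3$. From (2) we
can at once obtain the expectations of powers. For
$$\alpha_1 = \Sigma a_r^{-1} b_r z_r, \quad\text{so}\quad h_r = a_r^{-1} b_r,$$
and $S_1 = 0$, $S_2 = \Sigma a_r^{-1} b_r^2 = A_1$, $S_3 = A_2$, $S_4 = A_3$.
So from (2), $\mathscr{E}(\alpha_1^4) = 3N^{-2}A_1^2 + N^{-3}(A_3 - 3A_1^2)$.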
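The expressions for $\mu_2$, $\mu_3$ and $\mu_4$ hold exactly, not merely to leading order, and this can be confirmed by brute force for a small sample. The sketch below enumerates every outcome of a multinomial sample and compares the exact moments of $\Sigma h_r z_r$, where $z_r = n_r/N - a_r$, with the formulae in terms of $S_i = \Sigma h_r^i a_r$. The class probabilities, the constants $h_r$ and the sample size are arbitrary illustrative choices, and rational arithmetic is used so that the comparison is exact.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def exact_moment(h, a, N, k):
    """E[(sum_r h_r z_r)**k] with z_r = n_r/N - a_r, by full enumeration."""
    total = Fraction(0)
    for counts in product(range(N + 1), repeat=len(a)):
        if sum(counts) != N:
            continue
        coef = factorial(N)
        prob = Fraction(1)
        for n_r, a_r in zip(counts, a):
            coef //= factorial(n_r)   # builds the multinomial coefficient step by step
            prob *= a_r**n_r
        value = sum(h_r * (Fraction(n_r, N) - a_r) for h_r, n_r, a_r in zip(h, counts, a))
        total += coef * prob * value**k
    return total


def formula_moments(h, a, N):
    """mu_2, mu_3, mu_4 from the expressions in terms of S_i = sum h_r**i a_r."""
    S1, S2, S3, S4 = (sum(h_r**i * a_r for h_r, a_r in zip(h, a)) for i in (1, 2, 3, 4))
    mu2 = (S2 - S1**2) / N
    mu3 = (S3 - 3 * S1 * S2 + 2 * S1**3) / N**2
    mu4 = (3 * (S2 - S1**2) ** 2 / N**2
           + (S4 - 4 * S1 * S3 - 3 * S2**2 + 12 * S1**2 * S2 - 6 * S1**4) / N**3)
    return mu2, mu3, mu4


if __name__ == "__main__":
    a = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # illustrative class probabilities
    h = [Fraction(1), Fraction(-2), Fraction(3)]          # illustrative constants h_r
    N = 5
    exact = tuple(exact_moment(h, a, N, k) for k in (2, 3, 4))
    print(exact == formula_moments(h, a, N))
```

The agreement reflects the fact that $\Sigma h_r z_r$ is the mean of $N$ independent, identically distributed deviations, so its low-order moments follow from those of a single observation.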


To obtain expectations of products, the expressions (2) must be expanded if necessary: 
For example 


Hs = 2N-(58,8, — 5818, — 158,93 + 25828, — 1098) + O(N-3). 
In é[(=h,z,)4 (Zi, 2,)] one of the five h’s in turn is replaced by k,. Thus 
E((Zh,2,)* (Zk, z,)] = 2N- [3(8, — 82) (Sha — So Zka) 
 2(85 — 68,8, + 582) (Ehka — S, Dka)] + O(N-). 
From such expressions we immediately evaluate expectations of products. Forexample, i? 


a0 = at(a,—f,), h=ab, k= a-b? — ale. 


So 8,70, S,=4, S,—4, Ehka=4,—B, XMka- A,—B, Xka- A, 
The required expectations are as follows: 


ej = N-1A,, &(a,0) = N3(A,— B,), 
Eli) = NA, &(c20) = N-(A,— B,— A9), 
(o4) = 3N-242 + N(A,— 343), 
&( 30) = 3N-24,(A,— B,) + O(N-), 
4 (32) = 3N4,24, 3B, + Dj) + O(N), 
6 (230?) = N—2(4,— B? e A (A5 — 2B, 4-0.) — A3] -O(N 3), eM 
Elai) = 10N-*A, A, + O(N-4), l 
4 (010) = 2N-3[24,(4, — B.) --34,(4, — B,) 343] J- O(N-5), 
Elai) = 15N-3A3--O(N-1), 
6 (240) = 15N-243(4, — B.) -O(N-), 
6 (a1) = 15N-341(24,— 3B, +D;)+O(N-4), 
(a1 0") = 3N-2A,[4(4, — By - A (45 — 2B, 4-0) — A3] -O(N 9), 


Haldane (1953a) gave an incorrect expression for E[(£hz)? (Zkz)?] in his equation (4). His 


equation (6) should read 


li 


-^ 
vU 


E (ei) = N24, A,Hy). 


CALCULATION OF THE MOMENTS AND CUMULANTS 
The maximum-likelihood estimate is a root of 


Dn, file) Lf GT" = 0. 


as to which root is the estimator. Substituting from (1), 


(4) 


There is rarely if ever any doubt 
provided that a, is not zero, and summing, we obtain 
ay — (A, +0) y+ 324, 3B, + ^ — 4(6A3— 12B,+30,+4D,) y+... =0. 
These terms are sufficient to calculate the fourth moment of y to order N—. The first term 
tends to zero with N-?, while the coefficients of powers of y tend to constant values with 
large N. So when N is large we may invert this series, which is the expansion of 


1 
xl ern), 


by Lagrange’s theorem. 
We thus find 


y = Ara + Ar [A — $B) 4-417] 94 
T A; *[Q(45 — 3B,)?—Ay(As— 2By+ 15 +2D,)} a4 


—3A,(045 381) %10 + 442o,A-- A30?] + O(N). (5) 


ons (3) the moments of the distribution of 


ental equation and equati 
d two terms in the expansion of us. 


From this fundam 
Y are calculated. In order to evaluate K, We nee 


Sly) = -4N -4r*B, FON). 


This is the bias of a: 
bly?) = NA+ N24 - 4 4. 18B3 + A ds - B; D) - 41] +O(N~). 


Subtracting [6 (y). 
NAA NA: 


al — A34- 5B As (4s — B,— D) — A3] - 0(N). 


t of information about & per individual in the sample: 


4, 88) +O). 


Hy = 
Thus A, is the amoun 
Ely?) = NA: 


Subtracting 36 (y) Ey?) — 216 GF. 


Li Ha = N-2Az3(A,—3B,) + ON); 3 
(i e 2 n 105p? + A(7À —6B,— 10D,) — 941] 
(^) = ay-24rt4 N-SAT- 04) 1445 DHT eo ee S 
Subtracting 4&(y) EY?) — 6&(y2) [EWP + peor ‘le hat mints 
| 1 SORE MRT TRF tee aes T OUT M 
| Me 
Ma 3p 
m 345) -0(N7)). 
- NAA: 128,4, - 2B) + 44s 1D? 341] ( » 




Thus the first four cumulants of the distribution of x are: 


Kı = §—-3N-147°B, + O(N), ) 
Ko = NAT + NAT'L- A3 +4B}+ A (4a -— By—D,)— 4}]+0(N-3), (6) 
K = N-*A;9(A,— 3B,) + O(N 3), 
Ky = N-34;1*[ —12B,(A5 — 224) + A (A45 — 4D) — 343] -O(N1), | 
whence y; = N-At79(A, — 3B,) - O(NH), 


ys = N-1Az*[ - 128,(A4 2B) + A (45 — 4D,) — 343] + O(N 2). 
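As a numerical illustration of the leading bias term, the sketch below evaluates $A_1$ and $B_1$ for the constant-mortality example treated later in the paper, where the class expectations are $1 - x$, $x(1-x)$, $x^2(1-x)$ and $x^3$. It rests on two assumptions about the garbled source: that $A_1 = \Sigma a_r^{-1} b_r^2$ and $B_1 = \Sigma a_r^{-1} b_r c_r$, with $a_r$, $b_r$, $c_r$ the class expectation and its first two derivatives, and that the first cumulant reads $\kappa_1 = \xi - \tfrac12 N^{-1} A_1^{-2} B_1 + O(N^{-2})$. Under those readings the closed forms quoted for that example are recovered. Function names are illustrative.

```python
def classes(x):
    """(a_r, b_r, c_r) = (f_r, f_r', f_r'') for the four classes of the mortality model."""
    return [
        (1 - x,           -1.0,              0.0),        # died in the first interval
        (x * (1 - x),     1 - 2 * x,         -2.0),       # died in the second interval
        (x**2 * (1 - x),  2 * x - 3 * x**2,  2 - 6 * x),  # died in the third interval
        (x**3,            3 * x**2,          6 * x),      # survived all three intervals
    ]


def A1(x):
    """Assumed definition: A_1 = sum of b_r**2 / a_r (information per individual)."""
    return sum(b * b / a for a, b, c in classes(x))


def B1(x):
    """Assumed definition: B_1 = sum of b_r * c_r / a_r."""
    return sum(b * c / a for a, b, c in classes(x))


def leading_bias(x, N):
    """Leading term of the bias, -B_1 / (2 N A_1**2), under the assumed reading."""
    return -B1(x) / (2 * N * A1(x) ** 2)


if __name__ == "__main__":
    x, N = 0.6, 100
    print(A1(x), (1 + x + x**2) / (x * (1 - x)))                 # two forms of A_1
    print(leading_bias(x, N),
          -x * (1 - x) * (1 + 2 * x) / ((1 + x + x**2) ** 2 * N))  # two forms of the bias
    print(1 / (N * A1(x)), x * (1 - x) / ((1 + x + x**2) * N))   # leading term of the variance
```

The agreement of the generic sums with the closed forms is what supports the assumed definitions; it does not check the higher-order terms of the cumulants.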


If every $f_r(\xi)$ is linear, that is to say, of the form $h_r + k_r\xi$, then every $c_r$ and $d_r$, and hence
every $B_i$, $C_i$ and $D_i$ vanishes, and the maximum-likelihood estimate is almost unbiased. It
has, however, a bias of order $N^{-2}$, evaluated by Haldane (1953a) and given below. This can
perhaps always be neglected. The cumulants of the distribution of $x$ are:


Ky = + NAT (A3 — 24,45 A5 4- A344) -O(N73), 
Ka = NCAT- NAGUN (41- A, Ay + A3) + O(N), | 


7 
Ks = N“*Ay*A, + O(N), a 
Ky = NAT (A; — 347) -O(N-3), 
whence ism N-HAL34, - O(N-), 


Ya = N-(4524,— 8) -O(N-9). 


Thus y, is positive near the limits of the range of £, where at least one value of a, is small, 
and hence 4,74, is large. It may be negative in the middle of the range. 

Now in practice the constants A; B; ete., are unknown. They must be estimated from 3; 
the estimate of £. If we put A1 for the maximum-likelihood estimate of A,, namely, 


EUG) LLC, 
Ay = A= (45 - 2B) y + (45 5B, - C, 4- D) f? + ZI 
64) = Ay + NC APLAUB A, - B) + 4, - $B, 01€ D,] -O(N9, 


and similar expressions can be found for other estimates, Thus (6) and (7) remain true whe? 


A, is substituted for A; and so on, except that the second term in the sampling variance has 
no validity. 


We may also use the fact that fA) = N-n,+ O(N-1) 


then 
So 


and use the approximations 
Ay = Nnr ilf), Bi = Nine fia) fia), 
Ci = NEn f(a) See, Dy = NEn if TA (ae) 
for A;, Bj, C; and D, in expressions (6) and (7) without introducing serious errors provided 
no n, is small. 
We have so far neglected the possibility that any a, is zero. This can only be the case if, 


in equation (4), at least one n, is zero, though it may be necessary that several n, S shou 
vanish. Thus if 

Ait) =a, fax) = 27, f(a) = 1—a—2?, 
equation (4) gives 


(n4 + 2na + 2n3) x? + (n4 + 2n, n3) x — (n4 + 2n) = 0, 


and since no n, can be negative, x can only be zero if n, = n, = 0. 


* 


, consider a continuous variate 




If is zer i i 
ee ma CV Nadine 
eee nn oe ae 2 Ue n Oy ipe o say, the estimate is absolutely 

à 1 , A, is infinite, that is to say, v is not only an unbiased estimate. but 
a precise estimate. However, we can never know that this is the case on the basis of a finite 
See The numbers A,, B; etc., all become infinite if any a, vanishes, and the estimates 
9 them become meaningless if any f(x) is zero, and unreliable if it is sufficiently small. This 
is of course paralleled in the case of simple sampling when no members of a sample POSE 
some attribute. This problem has been discussed since the time of Laplace, and a further 
discussion would be out of place here. To sum up, if there is a set of one or more values of 
n, whose expectation is zero for some value of £, then ifall these values are zero the expressions 
found are inapplicable, and if their total is small (say less than 5) they become very 


inaccurate. 


So far we have supposed tha of a discontinuous variate. Let us now 


tthe sample is a sample 
p, whose distribution is given by: 


dF = f(p. E) dp. 
nown parameter to be estimated. Suppose that 


and č an unk 
..; py, then the maximum- 


where f is a known function, 
assumed the values pi; o; -+-+ Pr - 


in a sample of W members p has 
likelihood estimator is N 

X ft. Df P 2 = 9. 

r=1 

as 


and we have only to put 
^ C o 
d, = as (Pr t), 


^ d e t- e " 
a, = f( P, x), by = a 9» 9 = aged (Pr 9. 
and (7). It follows that the maximum- 


able to use formulae (6) 
y unbiased, and that if 


and A, = Lath, etc., to be 
a normal distribution is absolutel 


A ER estimate of the mean of 
P, §)i 

auipaf iis form £f (9) 01. Bs) 
ctions, the maximum-likelihood estimate of £ 
all, we should group 
the rth interval. 


funi 
tice, unless N were quite sm 
the central value of p in 


Where f. (p) and fo(p) are known frequency 
as à bias tending to zero with N-?. In prac 
our data, and f, (v) would be f(D), where p, 15 


Discussion 


). In his equations (12) the expression for the 


]dane (19534 
d estimate should read 


oN [Ay —3B,)+ O(N-*). 


s first correct an error of Hal 
hird cumulant of the generalize 


Kg = 
Since x, is independent of k the skewness of the distribution of the minimum discrepancy 
Stimate (I; = +1), or of the estimate found by putting k = 2 so that 
8] = 0, 


S [n (0+ DÍ: “(a) {f,(@)} 
7 
which is nearly equivalent to minimizing $\chi^2$, is the same as the skewness of the maximum-
likelihood estimate.

The statistical importance of a knowledge of the distribution of $x$ can best be judged from
examples. Let us consider the case where
$$f_1(x) = \tfrac14(2 + x), \qquad f_2(x) = \tfrac12(1 - x), \qquad f_3(x) = \tfrac14 x,$$


which arises in linkage estimation from $F_2$ data, and has been extensively discussed by Fisher (1954) and Haldane (1953a). The maximum-likelihood solution is the positive root of $Nx^2 - (n_1 - 2n_2 - n_3)x - 2n_3 = 0$, an alternative efficient estimate by the method of minimum discrepancy (Haldane, 1953a) being

$$\frac{2(n_3+1)(2n_1-n_2+1)}{(n_1+1)(n_2+1) + 4(n_1+1)(n_3+1) + (n_2+1)(n_3+1)}.$$

Here $B_1$, $C_1$, etc., are zero and we can use (7):

$$A_1 = (1+2x)[2x(1-x)(2+x)]^{-1}, \qquad A_2 = \tfrac12(2-2x-5x^2-4x^3)[x(1-x)(2+x)]^{-2},$$
$$A_3 = \tfrac12(4-6x-3x^2+14x^3+12x^4+6x^5)[x(1-x)(2+x)]^{-3}.$$

Hence the bias of x is $72x(2+x)(1-x)(1+2x)^{-3}N^{-2}$, which becomes $+144xN^{-2}$ when x is very small, and $+8(1-x)N^{-2}$ when x is nearly unity. It is always negligible. The variance, as is well known, is $2x(1-x)(2+x)[(1+2x)N]^{-1}$ and

$$\gamma_1 = (2-2x-5x^2-4x^3)\,[\tfrac12 x(1-x)(1+2x)^3(2+x)N]^{-1/2},$$
$$\gamma_2 = (8-18x-27x^2+19x^3+48x^4+24x^5)\,[x(1-x)(2+x)(1+2x)^2]^{-1}N^{-1}.$$

Thus $\gamma_1$ changes sign when x = 0.418. $\gamma_2$ is negative if x lies between 0.353 and 0.611. Thus over this range the maximum-likelihood estimate is rather more precise than it would be were it normally distributed. When x or 1 − x is small, it is less precise. When x = ½ and N = 100, $\gamma_1 = -0.0474$ and $\gamma_2 = -0.0065$.
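The linkage example can be checked numerically. The following is a minimal sketch, assuming the three-class model $f_1 = \tfrac14(2+x)$, $f_2 = \tfrac12(1-x)$, $f_3 = \tfrac14 x$ and the large-sample expressions quoted in the text; the function names are illustrative and are not taken from the paper.

```python
import math


def linkage_mle(n1, n2, n3):
    """Positive root of N x^2 - (n1 - 2 n2 - n3) x - 2 n3 = 0."""
    N = n1 + n2 + n3
    b = n1 - 2 * n2 - n3
    return (b + math.sqrt(b * b + 8 * N * n3)) / (2 * N)


def linkage_variance(x, N):
    """Large-sample variance 2x(1-x)(2+x) / [(1+2x) N]."""
    return 2 * x * (1 - x) * (2 + x) / ((1 + 2 * x) * N)


def linkage_gamma1(x, N):
    """Skewness coefficient gamma_1 as given in the text."""
    numerator = 2 - 2 * x - 5 * x**2 - 4 * x**3
    scale = 0.5 * x * (1 - x) * (1 + 2 * x) ** 3 * (2 + x) * N
    return numerator / math.sqrt(scale)


if __name__ == "__main__":
    # Counts equal to their expectations for x = 1/2 and N = 200.
    x_hat = linkage_mle(125, 50, 25)
    print(x_hat, linkage_variance(x_hat, 200), linkage_gamma1(0.5, 100))
```

Feeding the function class counts equal to their expectations returns the true value of x, which is a quick consistency check on the quadratic.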

As a non-linear example consider the case where, of N organisms subject to constant mortality, $n_1$ have died during the first of three equal intervals, $n_2$ during the second, and $n_3$ during the third, while $n_4$ survive. If ξ be the probability of surviving through an interval,

$$f_1(x) = 1-x, \quad f_2(x) = x(1-x), \quad f_3(x) = x^2(1-x), \quad f_4(x) = x^3.$$

Thus
$$x = \frac{n_2 + 2n_3 + 3n_4}{n_1 + 2n_2 + 3n_3 + 3n_4},$$
$$A_1 = (1+x+x^2)[x(1-x)]^{-1}, \qquad A_2 = (1+2x+2x^2-8x^3)[x(1-x)]^{-2},$$
$$A_3 = (1+8x+9x^2-58x^3+43x^4)[x(1-x)]^{-3}, \qquad B_1 = 2(1+2x)[x(1-x)]^{-1},$$
$$B_2 = 2(3+7x-13x^2)[x(1-x)]^{-2}, \qquad D_1 = 6[x(1-x)]^{-1}.$$

Thus the bias of x is $-x(1-x)(1+2x)(1+x+x^2)^{-2}N^{-1}$, which is always negative, and the variance $x(1-x)(1+x+x^2)^{-1}N^{-1}$. Thus if x = ½ and N = 100, the bias is −0.00163 and the standard error 0.0378. It is barely worth while to correct for a bias of 4 % of the standard error.

$$\gamma_1 = [Nx(1-x)]^{-1/2}(1+x+x^2)^{-3/2}(1-4x-4x^2+4x^3),$$

which changes sign when x = 0.214, and is only −0.13 when x = ½ and N = 100.
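The figures quoted for the mortality example can be reproduced directly. This is a minimal sketch, assuming class probabilities $1-x$, $x(1-x)$, $x^2(1-x)$, $x^3$ for death in the first, second and third interval and for survival; the function names are hypothetical.

```python
import math


def survival_mle(n1, n2, n3, n4):
    """Maximum-likelihood estimate of the survival probability per interval."""
    return (n2 + 2 * n3 + 3 * n4) / (n1 + 2 * n2 + 3 * n3 + 3 * n4)


def survival_bias(x, N):
    """Leading term of the bias: -x(1-x)(1+2x) / [(1+x+x^2)^2 N]."""
    return -x * (1 - x) * (1 + 2 * x) / ((1 + x + x * x) ** 2 * N)


def survival_standard_error(x, N):
    """Square root of the variance x(1-x) / [(1+x+x^2) N]."""
    return math.sqrt(x * (1 - x) / ((1 + x + x * x) * N))


if __name__ == "__main__":
    bias = survival_bias(0.5, 100)
    se = survival_standard_error(0.5, 100)
    print(round(bias, 5), round(se, 4), round(abs(bias) / se, 3))
```

The ratio of bias to standard error printed at the end is the quantity the text describes as about 4 % when x = ½ and N = 100.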


Ya = N7Q x +x?) [a1 —2)} 


When z is small y, approximates to (Nx), and when 1 
€ = 5, y, = 1039/(343N) FON $). Thus when z = 1 and we ru 
Haldane (19535) consi T ja Yo = + 0-0303. 


parameters, 


v- 


kh 


1,,,222, 308. 


J. B. S. HALDANE AND SHEILA MAYNARD SMITH 103 


heir variances and covariance. The method used was 


e rmi 
onfirming the known expressions fort 
er. The calculation of the higher cumulants of the 


ese z that employed in this pap 
Fisher rena of estimates is clearly, however, Very complicated. 
Tkelikesd = howed that if ġisany regular monotonie function, and x is the maximum- 
ate of £. d(x) is the maximum-likelihood estimate of $(&). We can now ask 
almost unbiased estimate of J(&)? 


the fur : 
m question: For what functions is f(x) an 
e (1953a) defined an almost unbiased estimate as one whose bias tends to zero with 


N-2 
or more rapi : 
more rapidly as V increases: 


g(a) = E+) 
= oë) +y% (E) + wo Gt e 


So 
é(9()] = 9(5) + o'(E EU) + 19" ( 697) * 
-$(0- IN3A;B, QE) 4w-44z"9"(6) +O(N~). 


So the condition is 


$8) _ Az By. 
gu e^ 
Hence oë) - [ew [fanaa dé. 


For à : : 
r example, in the linear case here considered 2 is an almost unbiased estimate of £. In 


the ; 
non-linear case 


AB = 


NS so e TEE E+E 


When 
€ E and F are any constants. 
estimates in the branches of biology with 


"ul : Shall briefly discuss the valu Plu z qiu 
estim we ate familiar. If the estimate, , stands by itself, a maximum! elihood 
Eu ate is probably as useful as T he same is true when a function of the 
Jii will later be tabulated. the estimate is to be one of a large 
B cum, say gene frequencies in po or relative viabilities of phenotypes, it is 
x esirable that the estimates should be consistently high or low, on the average, since 
mean or median based on 2 la will have a bias which in some 

. It is therefore desirable 


Case. 
8 could be of the order of ma 


ab est; 
estimates which are subsequently to 


SUMMARY 
likelihood estimate. The 


mulants of a maximum 
ple number is of the order of 100. 


first four € 
s, when the san 


Ex 

s 

"pressions are given for the 
ch estimate 


las ig 5 
only negligible in some SU 
REFERENCES 


Ep 

SEWor ” 
= ble error of frequency à m. 
un da es Det al foundations of theoretical statistics. 


ISHER 
, R. A. (1921). On the mathematic: 


nstants. J. R. Statist. Soc. 71, 381. 
Phil. Trans. A, 


12th ed. London: Oliver and Boyd. 
ter. Bull. Int. Stat. Inst. 33, 231. 
sample. Sankhy@, 12, 313. 

statistics. Trans. Amer 


rch Workers, 


HER, R. A Method: Bese 
Hay. , R. A. (1954). Statistica X s for m 
à Chose : of efficient estimates of a parame 
eters from & 


DA’ 
Patna, oes (19532). A class = Be 
; J. B. S. 53b). timation $ EE LEE 3 
itme, H. dU hee sis and ultimate distribution of optimum 
ath. Soc. 32, 847. 


TESTS FOR RANDOMNESS OF POINTS ON A LINE


By D. E. BARTON AND F. N. DAVID
University College London 


1. In previous papers (1956a, b) we have discussed the distribution of various functions of ordered intervals any of which might be used as a test for the randomness of points on a line. In this present note we proceed a stage further and discuss approximate distributions of sums of ordered and unordered intervals. These distributions can be used to test the randomness of points on a line or as bivariate tests for randomness. The power function of any, or all, of these tests with respect to a specified alternative can be obtained to a similar degree of approximation and we propose to describe these in a further note.

2. It is possible to rank intervals along a line according to their position from one end or according to their magnitude. We assume that on a line of unit length, (n − 1) points are dropped randomly giving n random intervals. Let the points be dropped at distances $\{x_i\}$ from one end (i = 1, 2, ..., (n − 1)), with

$$0 \le x_1 \le x_2 \le \dots \le x_{n-1} \le 1.$$

Write
$$d_1 = x_1, \qquad d_i = x_i - x_{i-1} \quad (1 < i \le (n-1)), \qquad d_n = 1 - x_{n-1},$$
so that $d_i$ is the ith interval along the line. These $\{d_i\}$ can be given a rank according to their magnitude. Letting the smallest have rank 1 and the largest rank n we may write these as $\{g_i\}$. The positional rank of the interval $g_i$, which is the ith in ascending order of size, we write as $r_i$. The series $\{g_i\}$, which is merely a rearrangement of the series $\{d_i\}$, will be

$$g_1 \le g_2 \le \dots \le g_n.$$

The magnitudinal rank of $d_i$, which we do not use except for computational rearrangement, we may write as $r_i'$. We have then that $g_i = d_{r_i}$ and $d_i = g_{r_i'}$. Using this notation, the four criteria considered are

(i) $R = \sum_{i=1}^{n} i\,r_i$,

(ii) $Y = \sum_{i=1}^{n} i\,d_i$,

(iii) $G = \sum_{i=s}^{n} r_i g_i$,

(iv) $G^* = n\sum_{i=1}^{n} d_i g_i$.
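The four criteria can be computed mechanically from a set of points. The sketch below assumes the definitions as read here: $d_i$ the ith interval along the line, $g_i$ the ith smallest interval, $r_i$ the position of $g_i$ along the line, and s the lower limit of the partial sum G. The function name and the dictionary keys are illustrative only.

```python
def randomness_criteria(points, s=1):
    """Return R, Y, G and G* for points dropped on the unit interval."""
    xs = sorted(points)
    n = len(xs) + 1                                   # number of intervals
    bounds = [0.0] + xs + [1.0]
    d = [bounds[i + 1] - bounds[i] for i in range(n)]  # d[0] is d_1
    order = sorted(range(n), key=lambda i: d[i])       # positions by size
    g = [d[i] for i in order]                          # g_1 <= ... <= g_n
    r = [i + 1 for i in order]                         # positional ranks r_i
    return {
        "R": sum((i + 1) * r[i] for i in range(n)),
        "Y": sum((i + 1) * d[i] for i in range(n)),
        "G": sum(r[i] * g[i] for i in range(s - 1, n)),
        "G*": n * sum(d[i] * g[i] for i in range(n)),
    }


if __name__ == "__main__":
    print(randomness_criteria([0.4, 0.5, 0.8]))        # s = 1, so G equals Y
    print(randomness_criteria([0.4, 0.5, 0.8], s=3))
```

With s = 1 the partial sum G runs over every interval and coincides with Y, which gives a simple check on the ranking step.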




3. Rasa test for r: 
bing by Hot eri 5 an has been discussed on several occasions, the first, we think 
1 c Pabs 36) in a bivariat licati i i j 
sale fnis e application. They showed that Ri i 
j ed as a Pearson Type I. O ene ee 
Swim ype I. Olds (1938) tabulated a related i 
2 to 10 after which the r i rae ep dre 
bus ede he result that fn y distri r 
ee gaa vid is normally distributed can be used. The 
" a 
Y-2YXid;-n ms 
i=1 i=1 
ables, and since we consider them all the fact 


The 
x, are (n—1)i 

„are (n — 1) independent rectangular vari 
of their sum. We have, using the 


that tl n : 
PE a y ordered will play no part in the distribution 
all (1927) and Irwin (1927), that 


7 1 E z 
p(Y)- a ia 1yi+ ^1 6n jar 
[(n—k) > Y>(n—k-1); k= 1,2, i(n- 1)]- 


of great importance, since the mean of (n—1) 


The c A 
exact distribution is not, however, 
to be normally distributed with 


indepe 
agi rectangular variables tends very quickly 
ing n. We have approximately that 


Y: "ES l pec 
Is à nor ; 
normal variable with 1 
c "dl n m ý ua : 
&Y)-tu var(Y') = Ima’ 


we may use the fact that the percentage 


nz25. Forn small 
ton’s (1953) tables of y3. For 


are embodied in Bar 
12: ud e 
n—l ( 2 * 
-tailed test is requir 
Y will obviously be sensi 


ility is changing along the 
ances it will be the most po 
ype where à long inter 


Which i 
Bois is accurate enough for 
8 for a rectangular sum 


ed we may use the fact that Y has 
tive to departures from 
line; in fact, as we shall 
werful test 
val 


is ex 
a E dibtributed as y3. Ifa one 
adarte i distribution. The criterion 
iscuss in E : the type where the probab 
Which can b urther paper, un 
tends to p, e used. It will not be sensi 

4. If e followed by a short one. 

we allow s to equal unity then 


der certain cireumst: 
tive to fluctuations of the t; 


G is Y. It occurred to us that the discussion ofa 
the zero intervals 


Partiaj 
l sum of intervals, such as that defined by 6, might be useful in that 
be eliminated. (This lack of refined 
rvals, but on 


Aisi, 
EL lack of refined measurement could thereby be € 
Ne whol ent will lead to a bias in that it may overemphasize the larger inte à 
istributi this bias does not seem to be o i .) The derivation of the 
Derg fan of G is theoretically possible following the ale r two earlie 
56a, b). Since the derivation is the same, consider a m 


a developed in ou 
ore general form of G say 


(s<t<n)- 


b 


[ 
G,-7 X nf 

"der t} ps 

y 2e null hypothesis 7; and g; ave independent and we have 
8 

84 Deora «+ gi) = Pls 7) p(958ss1 777 gi) 

;-1 x 

<i Se dee e 


j-0 


aser Mee gy 


Substitute for g,: 


s (-1y, , (n—t41),, 

T- . - | G, 

B('stues -+ Tiga ues «+ i3 Gat) LP Peony ... r) Pp " C; i; 1 
4 =) n—t+s—2 
(MET oe ea -1)| s 

t «( " = (JT) Jesus B "n ta 
Integrate out for 91; --- Js in turn and at any stage the inequalities 
Pa e Teresa rada yg Ee Fes (k =t-1,t-2,...,8+1) 
a T uses En 
=g 


will hold. The conditional distribution for Gy given a set of ranks is thus proportional to 


s—1 t-s 
DTI S 
jl ime t i-8-1 
[-ayes $ mA 
f=t—w+1 = 7 " 
s t—s—w t—w t ]w-1 t-v d " 
r(n-t+w)-l > n | Dy (n—t--v)r;— (w — v) > «| 
i m3 i ) i=t w41 s Z ie ual ia piw 
t 
E. [eter m teen) P «| 
f-i-w4l 
here (n=s+j+1) nor (n—t-w),, Vn-2 
Mg A= iE = ee 3 Le, 
Sy) r 
gT fei iil 


The complete distribution of Gy will be obtained by summir 
over all possible permutations of the ranks {r;}. Since we h 
conditional distribution we have not pursued the complete 
be intractable, Previous work (Barton & David, 1956 b) on 
the g's showed that the normal approximation was fairly g 
anticipated that the Same would be true for this present c 


Special cases that the A, and f,— 3 are of order (n), 


5. The moments of Ga may be derived from those we have already given (1956 a) for the 
Powers and products of (gj) together with the moments of the finite population of natural 
numbers given in Table 2. Let 

E Ugh) = Elg,- Elg) 


the corresponding cumulant. We have 


ng this conditional distribution 
ave been unable to simplify the 
distribution as it will obviously 
the unweighted partial sums of 
ood for n> 20, and it might be 
ase. We show in $7 for certain 


and «(gh 


t 
Ea) = Mn+) Sata) 


where li 1 
K(g;) = ng nji E 
2 TI à t È D * 
Alà) = T [z Su?» (| +e È SB (0-1) 43 [x sof- 32 si? | 
where a i 1 
S(i2) = 
uu rss) 


In terms of the cumulants of the individual g's this second crude moment may be written

Qi D) Qn 1) 4 go MENGE s gig) 
Xu. uU 


óé(G3) = $ 


n+1)(3n+2 ,, n(n+l ? 
UE On Ds gp t lS fkg. 
= n, and ofits standard error for different 


hen £ 
— to facilitate com- 


Values of the expected value of G, i.e. of Ga W 
They are multiplied by z 


values of n and s are given in Table 3 below. 
Parison with their asymptotic form. 


6. The higher moments of Gy can e method. Thus if 


also be obtained by the sam 


- Sci") 8g) (ti). (77a D 


So Mz Mna 
we have 
Sust ssp + 289 
135,89 - SB + 689+ (n —0 (8s — 80) Sa) 


With a more complicated expres rth crude moment which we do not reproduce 
here. In the case in which we are rested, that is, when tequalsn, the moments 
may be considerably simplified by expressing them as polynomials in S((s— 1)" ), with 
Coefficients as rational functions of ? and k = n—s+ 1. The S((s— 1)*) are just the differences 
of the appropriate polygamnma function of n and k. Thus if 


1) = 1+F(n)- F(k)=T, (say) 


2 (+1),, oQ F 3 
)= Fr + St ASS FHT 


sion for the fou 
particularly inte 


1+S((s- D 
suet) =F + FH) = T. Gap) 
po se-m- [as eee [i-t 


the central moments of G are (0r follow from) 
LER 
pom KT, 


k , 
P = qaa LT Ion 4nb— 3k) - n (anke 2k +m) + 2n(2n 4 V]. 


p? oe t(n3— — ant- 2k) T — 6n 2k) 
EY) E em psp +1) +7 TonQ 
Inna). gl ( ) 1^2 + TS k(3nk + 2k — n?) + 2n?], 
Mm ESTERI 8 saaga taan 12) + (200! + 24n? — 5n — 6) 
240n(n + 2) ara m 
4-167, [n (97? + 15n4-4) + kn? 21n—4)] 
4- A(T? +T) [2n2(n 2) 4- nk (400? + 69n + 23) Ak (30n? + 35n2— 11n—12)] 
1 2. 
HE 6T1T, 3T? 8T Tot 67, [- 21 (n 1) +nk(5n?+ 9n + 2) 
di 


sonein 26+ 8) € LS + 169— 10n - 8)). 


7. We now consider the case where the ratio s/n is fixed and equal to p say. Let 
q=1-p, Q=1-logg. 


Since the Euler-Maclaurin expansion of T, yields 


n-Q9«o(). Tasta meo n) wv. 


1 ! 
n niq! u+l 
we have for the cumulants of G 


Met LO, se TC —3q-- Q*U +49)) +0(1), 


= S PHAN — 5g 29?) Q-- 24039 — 1) Q3] -0(1), 


p= 24 (837 — 1464g + 11404? — 24593) + 120(17 — 929. 5q? + 2093) 

+ 4Q°(5 — 2719 + 91492 — 35093) — 2Q4(1 + 4q — 168g? + 38493)] + O(1)- 
These reduce to the values for the re 
(i) the actual values of np, and n(f5 


term of their expansion as a powe 
given in Table 3. 


8. It is possible to obtain the moments of G using a moment-generating function of 


à certain simplicity. From our previous work, if W, denotes the weighted sum of gy, .... (f 
with weights Tis ..., T, We have 


ctangular mean when q = 1 = Q. A comparison of 
— 3) calculated from the formulae of $6 with (ii) the first 
T series in n~ using the formulae immediately above i$ 


eo 


D Og reae, ys = er. Way) 


m=0 E 
k k 
Xn x«l. 
| 1- (1- terne, m UU fe die 
PEUT BEE) sy n 


Hence if the {r} are randomly drawn from the finite 
independently of the (gi), if t = n, and 


population of the first n natural numbers 


bs 1 
r= pit +r), 
we have Bin-1j m i N 
dli PLC 
where bij 


> MIT 
üásüc..«i 


‘ty 
v 


is the homogeneous product sum of weight v. Let 
Es = OFF; ... T.) Gxj«...«k). 
We can work out simply the corre, 


sponding A 
sums of the moments of the rs wh 


-functions which are seen to reduce to linear 
ose coefficie 


nts are polynomials in k. The Z's are given a 


"S P 


= EN 


IRS 




funetions 1,5 4. as they a u 
of i,j and the moment i 

EE s of the finite lati i 
E Pi Be mi population in Tabl shey qui 
general and of some intrinsic interest. The quantities eee 


Mi) 
n—i4l 


are si - B 
VERO M in terms of the corresponding power sums (David & Kendall, V, 1953) 
Dsum cen = AS and the moments of the finite population of the first » natural 
The has : ^ ne moments of G follow immediately after a little reduction. 
TN m E, o A for the mean and variance agree well enough for n = 25 for them 
P nens T pe 25in forming the test function in place of the true moments based on the 
S The i i course if tables of these last are easily available they may be used. 
9. The sum of squares of intervals along a line has been discussed by various authors, the first being Greenwood (1946). Approximations to its distribution have been suggested, but the exact distribution has not yet been obtained except in the cases n = 2, 3 (Moran, 1947). It is therefore to be expected that the distribution of G* would be equally elusive, and this has proved to be the case. The exact distributions of G* for n = 2, 3 have been found; they proved to be of much greater complexity than those found by Moran for the sum of squares. The moments, however, are not difficult. Straightforward algebra gives

$$\mathscr{E}(G^*) = 1, \qquad \operatorname{var}(G^*) = \frac{n^2+7n-6}{(n+1)(n+2)(n+3)}.$$
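The mean and variance of G* give a standardized statistic at once. The sketch below assumes $\mathscr{E}(G^*) = 1$ and $\operatorname{var}(G^*) = (n^2+7n-6)/[(n+1)(n+2)(n+3)]$; the function names are hypothetical, and referring the standardized value to percentage points remains the reader's choice.

```python
import math


def g_star_variance(n):
    """Variance of G* for n intervals under randomness."""
    return (n * n + 7 * n - 6) / ((n + 1) * (n + 2) * (n + 3))


def standardized_g_star(g_star, n):
    """(G* - 1) divided by the standard deviation of G*."""
    return (g_star - 1.0) / math.sqrt(g_star_variance(n))


if __name__ == "__main__":
    for n in (2, 40, 100, 400, 1000):
        print(n, round(g_star_variance(n), 6))
    print(round(standardized_g_star(1.591, 40), 2))
```

For n = 2 the variance is exactly 0.2, which can be confirmed by direct integration over a single uniform point.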
The shape constants are 
Bes 16(n + 1) (n - 2) (m+ 3) i? - B1n2— 133n-- 60)? (n23) 
(n— 4 (n +5) (n2 + n — 6)? 7 


=0 (n-2) 
By = 3D ve 2) t 3) (n5-4- 6n 446391 — 15364n? +17172n—5040) > 4) 
(n4- 4) (n+ 5) (n+) (n 7) (n3 + 1n — 6)? 
-9$ _(n=3) 
=25 (n=2) 
nd are not 


om the distributions & 


2, 3 were derived separately fr 
of n were calculated. 


B. 
hey en from the general formula. fi and f for various values 
Pinos re found to behave in similar fashion to those of Greenwood's sum of squares; as 
very eases they diverge from the normal point and then turn back approaching normality 
Ro Table 5 gives & selection of these values. Approximations to the percentage 
st * of these distributions can be made by using the appropriate Ponraon curves with the 
d our correct moments or Johnson's system. The percentage points actually quoted in 
Outside were obtained by interpolation into Table 42 of Pearson & Hartley (1954), and 
Deroe € the range of the tables were estimated from à J ohnson Sg eune: The standardized 
percentage points are also given to make interpolation easier for intermediate values of n.

10. Earlier we remarked that the criterion Y could be expected to be sensitive to departures from randomness where the probability is changing monotonically along the unit line. This remark will also hold good for G. G*, on the other hand, like Greenwood's sum of squares, may be expected to be sensitive to any type of departure from randomness where the intervals become either too regular or too irregular. For interest, we have computed the four criteria which we mentioned in §2 for the data in Table 1.



Table 5. f. fl, for the distribution of G*, sample size increasing 


w 


| | 
40 | 100 400 | 1000|% 


$|4|5]|e€6]|3 ee Mee ae | 
| | | | | | 
Alo | 096 1-97 | 266 | 3-08 | 3-30 | 3-38 | 3:36 | 339 — 2.06 | 0-90 | 0-26 | 0-05 | 0-02 7 
| | 0-96 | 1-97 | 2 : -30 | E 29 — 20 26 | 0:08 | 0- 
fal 2-78 479 | 117 | 9-16 | 10-66 | 11.74 | 12-46 | 1292 | 13-17  1L54| 7-67 4-51 3-26 | 3-09 
Table 6. Percentage points of G* 
[G* — standardized G*] 
n= 40 | n= 100 n = 400 | n= 1000 
| I 
Percentage | | 
point | r. | - - | 
G* G* | G* G* | G* G* | G* G* 
| 
| 
be — NT 
Upper 05 | 1591 | 3-71 | 1-325-| 3-24 1142 | 284 | 1.086 | 271 
1 L488 | $06 | 1278 | 277 | i98 251 | 1077 | 243 
25 | 1362 | 227 | 1217 | 9.17 1-103 | 207 | 1.064 | 202 
5 Lag | 183 | wis | "qos | 3.085- 1-69 | 1-053 1-68 
Lower 5 mS pan | osa oer | ge = 158 | 0-949 |. 1.61 
25 | 026 |-1-72 | 0819 | 1.89 0:906 |-1-87 | 0-940 |. 1.90 
1 0.657 |—239 | oH |—2537 | (889 |-221 | 0-929 |.2.95 
95 | 0602 |—250 | 0755+ | 9.44 | 0878 |-2-44 | 0.921 |_9.48 
 — a 
REFERENCES 


Barron, D. E. & Davin, F. N. (1956a). J. R. Statist. Soc. B 


HorzELLING, H. & Pansr, M. R. 
Irwy, J. O. (1927). Biometrika, 
Moran, P. A, p, (1947). Suppl. J. R. Statist. Soc. 9, 92. 


Or 


PEARSON, E. S. & HanrrEY, H. O. 


(in the Press). 
A Mathematika, 2, 150. 


+ (1955). Biometrika, 42, 223. 
6). J. R. Statist. Soc. 109, 85. 


(1936). Ann. Math. Statist. 7, 29. 


19, 225. 


DS, C. G. (1938). Ann. Math. Statist. 9, 133, 


Cambridge University Press, 


(1954). Biometrika Tables for Statisticians, 1 


= 


T 


» 


PAIRED COMPARISON DESIGNS FOR TESTING
CONCORDANCE BETWEEN JUDGES†


By R. C. BOSE


University of North Carolina and Division of Research Techniques,
London School of Economics and Political Science


1. SUMMARY AND INTRODUCTION

In a recent paper, 'Further contributions to the theory of paired comparisons', M. G. Kendall (1955) considers paired comparison designs, in which each pair of judges have certain comparisons in common. Such designs should prove useful for testing concordance between judges. He notes that designs of an optimum kind which balance by numbers of objects compared, numbers of observers on given comparisons and so forth are rather rare. It is the object of this paper to obtain some paired comparison designs which have a high degree of symmetry. These designs have been defined in §2, and certain inequalities between the parameters are obtained in §3. In §§4 and 5, two special classes of designs have been investigated, and explicit designs for small values of n (the number of objects to be compared) have been given in Tables 1 and 2. The method of analysis would, to a certain extent, depend on what use the experimenter wants to make of the design. This question will be considered in a future communication.

My thanks are due to Professor Kendall for suggesting the problem, and for helpful discussion during the preparation of the paper.

† This research was jointly supported by the United States Air Force through the Office of Scientific Research of the Air Research and Development Command, and the Division of Research Techniques, London School of Economics and Political Science.

2. DEFINITION OF LINKED PAIRED COMPARISON DESIGNS

Suppose it is required to compare n objects, by employing v judges. Each judge compares r pairs of objects (r > 1), and in respect of each pair expresses his opinion whether he prefers one or the other object of the pair. In certain circumstances it may be desirable to allow a judge to express no preference with respect to either of the objects forming the pair. In this case we say that the preference is equally shared by the two objects. We shall assume that the pairs compared by any judge are all different. To ensure symmetry between objects and judges we require that:

(a) Among the r pairs, compared by each judge, each object appears equally often, say α times.

(b) Each pair is compared by k judges, k > 1.

(c) Given any two judges there are exactly λ pairs which are compared by both judges.

Designs satisfying these conditions may be called linked paired comparison designs. Clearly

$$r = \tfrac12 n\alpha. \tag{2.1}$$

The number of possible pairs is

$$b = \tfrac12 n(n-1). \tag{2.2}$$

There is a certain correspondence between linked paired comparison designs and balanced incomplete block designs. Each judge may be considered to correspond to a treatment, and each pair to a block. If a pair is compared by a judge, then the block corresponding to the pair may be considered to contain the treatment corresponding to the judge. Hence if a linked paired comparison design of the type considered exists there must exist a corresponding balanced incomplete block design with v treatments and b = ½n(n − 1) blocks, such that each block contains k treatments, each treatment occurs in r blocks, and two given treatments occur together in λ blocks. It follows at once that

$$bk = vr, \qquad \lambda(v-1) = r(k-1). \tag{2.3}$$

Using (2.1) and (2.2), the first of the relations (2.3) can be written as

$$v\alpha = k(n-1). \tag{2.4}$$

Also Fisher's well-known inequality (Fisher, 1940; Bose, 1949) gives

$$b \ge v \quad \text{or} \quad r \ge k. \tag{2.5}$$

Therefore

$$n\alpha \ge 2k. \tag{2.6}$$

It should be remembered that the existence of a balanced incomplete block design with parameters v, b = ½n(n − 1), r, k, λ does not ensure the existence of a corresponding linked paired comparison design due to the additional restriction (a). Clearly

$$r \ge \lambda. \tag{2.7}$$

The case r = λ is trivial, since in this case all the r pairs compared by any judge must also be compared by every other judge. This means that each judge compares the same pairs. Condition (b) then shows that there must be exactly k judges, and each judge compares every pair so that r = ½n(n − 1).
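Conditions (a), (b) and (c) are easy to verify by machine for any proposed design. The sketch below assumes objects numbered 1 to n and a mapping from judges to lists of pairs. The example data are the pairs listed for design no. (1) of Table 1; the judge labels "A", "B", "C" and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def linked_design_parameters(design, n):
    """Return (v, r, alpha, k, lam) if conditions (a)-(c) hold, else raise."""
    judges = {j: {tuple(sorted(p)) for p in pairs} for j, pairs in design.items()}
    if any(len(judges[j]) != len(design[j]) for j in design):
        raise ValueError("a judge has a repeated pair")
    found = {"r": {len(pairs) for pairs in judges.values()}, "alpha": set()}
    for pairs in judges.values():                       # condition (a)
        counts = Counter(obj for pair in pairs for obj in pair)
        found["alpha"].update(counts[obj] for obj in range(1, n + 1))
    usage = Counter(pair for pairs in judges.values() for pair in pairs)
    found["k"] = {usage[p] for p in combinations(range(1, n + 1), 2)}  # (b)
    found["lambda"] = {len(judges[a] & judges[b])       # condition (c)
                       for a, b in combinations(judges, 2)}
    for name, values in found.items():
        if len(values) != 1:
            raise ValueError(f"{name} is not constant: {sorted(values)}")
    r, alpha, k, lam = (next(iter(found[key]))
                        for key in ("r", "alpha", "k", "lambda"))
    return len(judges), r, alpha, k, lam


DESIGN_1 = {
    "A": [(1, 4), (1, 3), (2, 4), (2, 3)],
    "B": [(1, 3), (2, 4), (1, 2), (3, 4)],
    "C": [(1, 4), (1, 2), (2, 3), (3, 4)],
}

if __name__ == "__main__":
    print(linked_design_parameters(DESIGN_1, 4))
```

The returned values can then be checked against relations (2.1) to (2.4), for instance that 2r = nα and vα = k(n − 1).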


3. SOME INEQUALITIES

From (2.1) and (2.4)

$$v = \frac{k(n-1)}{\alpha} = \frac{k(2r-\alpha)}{\alpha^2}.$$

Hence from (2.3)

$$\lambda = \frac{\alpha^2 r(k-1)}{k(2r-\alpha)-\alpha^2} = \frac{\alpha^2}{2} + \frac{\alpha^2(k\alpha+\alpha^2-2r)}{2(2kr-k\alpha-\alpha^2)}. \tag{3.1}$$

This shows that for a given k, λ decreases as r increases. Now r ≥ λ, and when r = λ, we get from (3.1),

$$r = \lambda = \tfrac12\alpha(\alpha+1).$$

This proves that

$$r \ge \tfrac12\alpha(\alpha+1), \qquad \lambda \le \tfrac12\alpha(\alpha+1), \tag{3.2}$$

where the equality holds only in the trivial case r = λ.

Since λ must be a positive integer, (3.2) shows that α = 1 must imply r = λ = α = 1. Each judge therefore compares only one pair. Hence the case α = 1 is impossible except in the trivial case when there are only two objects and each judge compares them.

Again we can write (3.1) as

$$\lambda = \frac{\alpha^2(k-1)}{2k} + \frac{\alpha^2(k-1)(k\alpha+\alpha^2)}{2k(2kr-k\alpha-\alpha^2)}. \tag{3.3}$$

It follows from (3.2) that $2kr - k\alpha - \alpha^2 \ge (k-1)\alpha^2$. Hence the second term in (3.3) is positive. We thus have the inequality

$$\lambda > \frac{(k-1)\alpha^2}{2k}. \tag{3.4}$$

Since k ≥ 2, combining (3.4), (3.2) and (3.1), we have

$$\tfrac14\alpha^2 < \lambda \le \tfrac12\alpha(\alpha+1). \tag{3.5}$$

Also

$$\frac{d\lambda}{dk} = \frac{\alpha^2 r[2r-\alpha-\alpha^2]}{(2kr-k\alpha-\alpha^2)^2}. \tag{3.6}$$

It follows from (3.2) that dλ/dk ≥ 0. Hence λ is a monotonically increasing function of k, for a given r.
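The relations of §§2 and 3 determine n, v, b and λ once α, r and k are chosen. The sketch below works in exact rational arithmetic so that non-integral, and therefore impossible, parameter sets are visible. It assumes n = 2r/α, v = k(2r − α)/α² and the first form of λ in (3.1); the function names are illustrative.

```python
from fractions import Fraction


def design_parameters(alpha, r, k):
    """Return (n, v, b, lam) implied by alpha, r and k, as exact fractions."""
    n = Fraction(2 * r, alpha)
    v = Fraction(k * (2 * r - alpha), alpha**2)
    b = n * (n - 1) / 2
    lam = Fraction(alpha**2 * r * (k - 1), 2 * k * r - k * alpha - alpha**2)
    return n, v, b, lam


def satisfies_bounds(alpha, r, k):
    """Check alpha^2/4 < lam <= alpha(alpha+1)/2, the inequality (3.5)."""
    lam = design_parameters(alpha, r, k)[3]
    return Fraction(alpha**2, 4) < lam <= Fraction(alpha * (alpha + 1), 2)


if __name__ == "__main__":
    print(design_parameters(2, 4, 2))   # n = 4 objects
    print(design_parameters(2, 6, 4))   # n = 6 objects
    print(satisfies_bounds(2, 6, 4))
```

A necessary condition for a design to exist is that all four returned values are positive integers; integrality alone does not guarantee existence.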


4. LINKED PAIRED COMPARISON DESIGNS WITH α = 2

The inequality (3.5) shows that when α = 2 (neglecting the trivial case r = λ), we must have λ = 2. Using (2.1), (2.2), (2.3) and (2.4), we see that all the parameters of the design can be expressed in terms of n, the number of objects to be compared. In fact we have

$$v = \tfrac12(n-1)(n-2), \quad b = \tfrac12 n(n-1), \quad r = n, \quad k = n-2, \quad \lambda = 2, \quad \alpha = 2. \tag{4.1}$$

The existence of (4.1) implies the existence of the balanced incomplete block design with parameters

$$v = \tfrac12(n-1)(n-2), \quad b = \tfrac12 n(n-1), \quad r = n, \quad k = n-2, \quad \lambda = 2. \tag{4.2}$$

The balanced incomplete block design (4.2) for the values n = 4, 5, 6 is known to exist (Fisher & Yates, 1938; Bose, 1939), and can in fact be derived by first writing down the solution of the symmetrical balanced incomplete block design,

$$v = b = \tfrac12 n(n-1) + 1, \quad r = k = n, \quad \lambda = 2, \tag{4.3}$$

and then deleting one block and all the treatments in this block. This results in two treatments being deleted from each of the other ½n(n − 1) blocks, since any block of (4.3) has λ = 2 treatments common with every other block. Conversely, it is known that the existence of (4.2) implies the existence of (4.3) (Connor, 1952; Connor & Hall, 1954). Hence non-existence of (4.3) for any value of n implies the non-existence of (4.2) for the same value of n. Certain sufficient conditions for the non-existence of (4.3) are known (Shrikhande, 1950; Chowla & Ryser, 1950). In particular, the cases n = 7, 8 and 10 are impossible. The design (4.2) may be called the derived of (4.3).

We shall now show that for any solution of the symmetrical balanced incomplete block design (4.3), we can derive a corresponding solution of the linked paired comparison design (4.1). The process of derivation will first be illustrated by considering the special case n = 6.
ordenats ‘on for (4:3) in the special case n = 6 can be 


t ig k 

no : r 
obtained om (Carmichael, 1991) s he cells of a 4 X 4 square and then taking 
VA uie hei RE he same row or the same column as a given 


lock 
S the si i nt : 
ne eis treatments which oon? | . We shall take the sixteen 


Satme: i 
nt (b ; S k this treat 
Eur m aY* s M ` d. e. f. 9: h, kand arrange them as shown below: 
,1,2,9, $, 9 a 9p Us Coa ve 
m dE 
(4-4) 
8-2 




Then the 16 blocks of (4-3) for the case n = 6, are given by 


Ly 2, 3, 4, 


o 
a 


(4:5) 


3 4, a, b, s? k 
3 5, €; b d, e 
25 6, c f: g, h 
4, 8. 0, g. h, k 
4, 6, 0, d, e f 
5 6, 0, a, b, c 


A solution of the derived design (4.2) for the case n = 6 is now obtainable from (4.5) by omitting the first block in (4.5) and the treatments 1, 2, 3, 4, 5, 6 from the remaining blocks. Thus the derived design is the part included within the lines in (4.5). We can make a (1, 1) correspondence between the blocks of the derived design and the unordered pairs (i, j), i, j = 1, 2, 3, 4, 5, 6; i ≠ j, since each of these blocks has been obtained by deleting just one such pair, from the corresponding block of (4.3). Thus the block 0, c, f, k corresponds to the pair (1, 2) and so on. Now let us identify the ten treatments 0, a, b, c, d, e, f, g, h and k of the derived with 10 judges, and assign to each judge the pairs corresponding to the blocks in which the treatment corresponding to the judge occurs. We thus get the linked paired comparison design no. (3) of Table 1. Since each of the treatments 0, a, b, c, d, e, f, g, h, k occurs in exactly r = 6 blocks of the derived, each judge is assigned exactly six pairs, and since any pair of treatments occurs in exactly λ = 2 blocks, any two judges have just two pairs in common. Again, since each block of the derived has exactly k = 4 treatments, each pair is compared by just 4 judges. Finally, let i be one of the six deleted treatments 1, 2, 3, 4, 5, 6 and let x stand for one of the ten treatments of the derived. Then i and x occur together in only two blocks of the symmetrical design, for example, 1 and a occur together only in the blocks (1, 5, a, g, e, f) and (1, 6, a, d, h, k). Hence the object i occurs twice among the pairs compared by the judge x. This shows that each of the six objects 1, 2, 3, 4, 5, 6 occurs twice among the pairs compared by a judge.
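The derivation for n = 6 can be reproduced computationally. The sketch below builds a symmetrical design with 16 treatments from the cells of a 4 × 4 square, taking as the block of each cell the six other cells in its row or column, and then derives the linked design by deleting the first block. Treatments are labelled here by (row, column) coordinates, which is an assumption made for convenience and not the labelling 1, ..., 6, 0, a, ..., k used in the text.

```python
from collections import Counter
from itertools import combinations


def square_design(m=4):
    """Block of each cell: the other cells sharing its row or its column."""
    cells = [(i, j) for i in range(m) for j in range(m)]
    return {c: frozenset(d for d in cells
                         if d != c and (d[0] == c[0] or d[1] == c[1]))
            for c in cells}


def derive_linked_design(blocks):
    """Delete the first block; its treatments become the objects."""
    first, rest = blocks[0], blocks[1:]
    design = {}
    for block in rest:
        pair = tuple(sorted(block & first))     # the two deleted treatments
        for judge in block - first:             # surviving treatments
            design.setdefault(judge, set()).add(pair)
    return design


if __name__ == "__main__":
    blocks = list(square_design(4).values())
    design = derive_linked_design(blocks)
    usage = Counter(p for pairs in design.values() for p in pairs)
    print(len(design), {len(p) for p in design.values()}, set(usage.values()))
```

The printed summary should show 10 judges, 6 pairs per judge and 4 judges per pair, in agreement with (4.1) for n = 6.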


The same method can be used for obtaining a solution of (4.1) for any value of n, for which a solution of the symmetrical balanced incomplete block design (4.3) is known. We first write the blocks in such a way that the first block contains the treatments 1, 2, ..., n. Then each of the remaining ½n(n − 1) blocks contains just one of the unordered pairs (i, j), i, j = 1, 2, ..., n; i ≠ j. The remaining ½[(n − 1)(n − 2)] treatments may then be identified with judges. If x be one of these treatments, then the judge x is assigned pairs (i, j) corresponding to those blocks in which x occurs.

Cyclic solutions to the symmetrical balanced incomplete block designs (4.3) are known for the cases n = 4, 5 and 9. All the blocks can be obtained by developing a suitable initial block mod v. These initial blocks are given below:

n    Initial block
4    (0, 3, 5, 6)                              mod 7
5    (1, 4, 5, 9, 3)                           mod 11
9    (1, 16, 34, 26, 9, 33, 10, 12, 7)         mod 37

The corresponding linked paired comparison designs are the designs nos. (1), (2) and (4) of Table 1.
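Developing an initial block mod v means adding each residue 0, 1, ..., v − 1 in turn to every element of the block. The sketch below, with hypothetical function names, develops an initial block and checks that every pair of treatments occurs together in exactly λ = 2 blocks, as required for (4.3).

```python
from collections import Counter
from itertools import combinations


def develop(initial_block, v):
    """All v blocks obtained by adding 0, 1, ..., v-1 to the initial block mod v."""
    return [frozenset((x + shift) % v for x in initial_block)
            for shift in range(v)]


def pair_frequencies(blocks):
    """Number of blocks in which each unordered pair of treatments occurs."""
    return Counter(frozenset(p)
                   for block in blocks
                   for p in combinations(sorted(block), 2))


def is_symmetric_design(initial_block, v, lam=2):
    """True if the developed blocks are distinct and every pair occurs lam times."""
    blocks = develop(initial_block, v)
    freq = pair_frequencies(blocks)
    every_pair_present = len(freq) == v * (v - 1) // 2
    return (every_pair_present and set(freq.values()) == {lam}
            and len(set(blocks)) == v)


if __name__ == "__main__":
    print(is_symmetric_design((1, 4, 5, 9, 3), 11))
    print(is_symmetric_design((1, 16, 34, 26, 9, 33, 10, 12, 7), 37))
```

A block that is not a suitable initial block, such as (0, 1, 2, 3) mod 7, fails the check because some pairs then occur more often than others.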
5. SOME OTHER TYPES OF LINKED PAIRED DESIGNS

Let the number of objects n be even, say n = 2t. Then we can divide the t(2t − 1) pairs into (2t − 1) sets of t pairs each, such that each object occurs exactly once among the pairs of a set. For example, if n = 8, we can take the objects to be 0, 1, 2, 3, 4, 5, 6 and ∞. Then the seven sets are

Sets    Pairs
I       (1, 6), (2, 5), (3, 4), (0, ∞)
II      (2, 0), (3, 6), (4, 5), (1, ∞)
III     (3, 1), (4, 0), (5, 6), (2, ∞)
IV      (4, 2), (5, 1), (6, 0), (3, ∞)          (5.1)
V       (5, 3), (6, 2), (0, 1), (4, ∞)
VI      (6, 4), (0, 3), (1, 2), (5, ∞)
VII     (0, 5), (1, 4), (2, 3), (6, ∞)

In the general case, the 2t − 1 sets can be obtained by developing mod (2t − 1), the initial set

(1, 2t − 2), (2, 2t − 3), ..., (t − 1, t), (0, ∞).          (5.2)
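The sets of pairs can be generated for any even n = 2t by developing the initial set mod (2t − 1), with the object ∞ left fixed. The following is a minimal sketch; the string "inf" standing for ∞ and the function name are assumptions made for illustration.

```python
INF = "inf"  # stands for the fixed object written as infinity in the text


def pair_sets(t):
    """The 2t-1 sets of t pairs obtained by developing (5.2) mod (2t-1)."""
    m = 2 * t - 1
    initial = [(i, m - i) for i in range(1, t)] + [(0, INF)]

    def shift(x, s):
        # The object INF is unchanged; every other object is advanced by s.
        return x if x == INF else (x + s) % m

    return [[(shift(a, s), shift(b, s)) for a, b in initial]
            for s in range(m)]


if __name__ == "__main__":
    for number, pairs in enumerate(pair_sets(4), start=1):
        print(number, pairs)
```

For t = 4 the second set produced is (2, 0), (3, 6), (4, 5), (1, ∞), matching set II of the example with n = 8.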
Consider now a balanced incomplete block design with $v^*$ treatments, $b^* = 2t - 1$ blocks, $r^*$ replications, block size $k^*$ and in which every pair of treatments occurs together in $\lambda^*$ blocks, and make each block correspond to one set and each treatment correspond to one judge. We then obtain a linked paired comparison design by assigning to each judge the pairs of the sets corresponding to all blocks in which the treatment corresponding to the judge occurs. We obtain in this way a linked paired comparison design with parameters

$$n = 2t, \quad v = v^*, \quad b = t(2t-1), \quad r = tr^*, \quad k = k^*, \quad \lambda = t\lambda^*, \quad \alpha = r^*. \tag{5.3}$$

For example, in the case n = 8, we may start with the balanced incomplete block design with parameters

$$v^* = b^* = 7, \quad r^* = k^* = 3, \quad \lambda^* = 1.$$


If the treatments are taken as a, b, c, d, e, f, g, then the blocks are

a, b, d;   b, c, e;   c, d, f;   d, e, g;   e, f, a;   f, g, b;   g, a, c.

If we make them correspond to the sets I-VII given by (5·1), then we get the following linked paired comparison design:

Judge   Sets of pairs
a       I, V, VII
b       II, VI, I
c       III, VII, II
d       IV, I, III        (5·5)
e       V, II, IV
f       VI, III, V
g       VII, IV, VI
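
The construction above can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function names (`develop_sets`, `linked_design`, `parameters`) and the string `"inf"` standing for the object ∞ are illustrative assumptions. It develops the initial set (5·2) mod 7, takes the seven blocks as the cyclic shifts of (a, b, d), and counts the parameters of the resulting design.

```python
from itertools import combinations

INF = "inf"  # assumed stand-in for the object written as the infinity symbol in (5.1)


def develop_sets(t):
    """Develop the initial set (5.2) mod (2t - 1): 2t - 1 sets of t pairs each."""
    m = 2 * t - 1
    initial = [(i, m - i) for i in range(1, t)] + [(0, INF)]
    return [
        [frozenset(x if x == INF else (x + shift) % m for x in pair) for pair in initial]
        for shift in range(m)
    ]


def linked_design(blocks, sets):
    """Give each judge every pair of every set whose block contains that judge."""
    judges = sorted(set().union(*blocks))
    return {
        j: [pair for block, s in zip(blocks, sets) if j in block for pair in s]
        for j in judges
    }


def parameters(design):
    """Return (b, r, k, lambda, alpha); the last four are sets of observed values."""
    pairs = {p for ps in design.values() for p in ps}
    objects = {o for p in pairs for o in p}
    r = {len(ps) for ps in design.values()}                      # pairs per judge
    k = {sum(p in ps for ps in design.values()) for p in pairs}  # judges per pair
    lam = {len(set(design[x]) & set(design[y])) for x, y in combinations(design, 2)}
    alpha = {sum(o in p for p in ps) for ps in design.values() for o in objects}
    return len(pairs), r, k, lam, alpha


JUDGES = "abcdefg"
# Blocks I..VII: cyclic shifts of the block (a, b, d).
blocks = [{JUDGES[(j + d) % 7] for d in (0, 1, 3)} for j in range(7)]
sets = develop_sets(4)
design = linked_design(blocks, sets)
print(parameters(design))
```

Each of the last four returned values is a one-element set when the design is balanced in the corresponding sense, so the output can be compared directly with the parameters $b = 28$, $r = 12$, $k = 3$, $\lambda = 4$, $\alpha = 3$ given by (5·3).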


Table 1

Design
No.    Parameters        Judge   Pairs assigned to a judge

(1)    n = 4, v = 3      a       (1, 4), (1, 3), (2, 4), (2, 3)
       b = 6, r = 4      b       (1, 3), (2, 4), (1, 2), (3, 4)
       k = 2, λ = 2      c       (1, 4), (1, 2), (2, 3), (3, 4)
       α = 2

(2)    n = 5, v = 6      a       (3, 5), (2, 4), (1, 3), (1, 4), (2, 5)
       b = 10, r = 5     b       (2, 3), (3, 4), (1, 4), (1, 5), (2, 5)
       k = 3, λ = 2      c       (2, 3), (3, 5), (1, 2), (4, 5), (1, 4)
       α = 2             d       (3, 5), (1, 2), (3, 4), (2, 4), (1, 5)
                         e       (1, 2), (3, 4), (4, 5), (1, 3), (2, 5)
                         f       (2, 3), (4, 5), (2, 4), (1, 3), (1, 5)

(3)    n = 6, v = 10     a       (1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6)
       b = 15, r = 6     b       (2, 3), (2, 4), (3, 4), (1, 5), (1, 6), (5, 6)
       k = 4, λ = 2      c       (1, 3), (1, 4), (3, 4), (2, 5), (2, 6), (5, 6)
       α = 2             d       (1, 4), (2, 4), (1, 2), (3, 5), (3, 6), (5, 6)
                         e       (1, 4), (1, 6), (4, 6), (2, 3), (2, 5), (3, 5)
                         f       (1, 3), (1, 5), (3, 5), (2, 4), (2, 6), (4, 6)
                         g       (1, 2), (1, 5), (2, 5), (3, 4), (3, 6), (4, 6)
                         h       (1, 4), (1, 5), (4, 5), (2, 3), (2, 6), (3, 6)
                         i       (1, 3), (1, 6), (3, 6), (2, 4), (2, 5), (4, 5)
                         j       (1, 2), (1, 6), (2, 6), (3, 4), (3, 5), (4, 5)



R. C. Bose ' 119 
Table 1 (cont.) 


i 


Linked paired designs with a —2 


Design 


k 
No. REE 


Judge | in 


(5, 8), (6, Ths (7,8) (8, 9). (1 9) (As 2), (2, 3) (9. (e 
(0) (6,8), (1, E), (5, 7], (4,9), (9), 0.4 6 (99 
| (5:8), (3,0), (2 9), (Bs Ts (Le D) (2, Os (4 Bb (0 (89 
(7, 8); (3, 4), (2, 6), (2, 8), (6, 9), (3, 5), (l; 7), (1, 4). (5, 9) 
(l; 2); (1. 6), (3. 5), (4. 8). (3, 8), (2, 7), (6, 9), (4. 5), (7, 9) 
2:3), (2, T), (4, 6), (8, 9), (4, 9), (3, 8) (L T) (5 9) 

(2, 4), (5, 6), (1, 5)s (7, 9) (3; 8, (3, 7) 
(l, 3), (5. 8), (2, 8), (3, T), (5; 6), (1; 2) 
T 9) 9), (4 8), (BD, 0,2), (4 OD (DB) Mh OO 
(b 2» (1,9) (4.6) (Be Ae (Le 8) (Bs 9 (126 Bh Oo p 
(1, 6), (2, 9). (2 4). (1, 8), (7, 8) (3: 5) 

C T 43,7), eB (9). (Bs 9) (8.7 Gb S y 
d 2), (Ly (2 T) (L 3), (3s 5) (2 9), (8 9) 
(4, 5), (4, 9), (6, 8). (2 7), (2 6), (1, 5) 
(8, 9), (3, 6), (4, 9). (3; 5)» (5, 7) 


yog og og 


RSZ 


QmOUR 6 o0 cR 


(4, 9), (6. 9), (4. 7)» 


M MM = 
P 
ʻa 
= 
e 
m 
2 


* (2, 7), (1, 9) (5, 6) (5 8) (b 4), (2, 8), (5. T), (3, 9) 
(8 S) gh (be (BT (Be 9h Cb DBs Oh e y (s 7 
u s 9) (1, 7). (5s 8), (2, 4), (6 9), (3, 9). (4 8), (6, 7) 
Eam Ea Baa eat 
2 5, 9), (2, 5). 3, 9), (6, 
(1, 4), (4 7) (3, 8), (L 2. (6 7), ( 5), (3 
9 5, 9), (L 8). (1, 3), (75 9), (6, 7), (2. 4) (3, 6) 
(4, 8), (2 9 (8 9} ( (6, 8). (9. 6). (1, 3. (2 2» e D 
7 (27). (1, 6), (5, 8) (3, 4), (6, 8)» C, 
i) (0. 00. 09. eA} eG, (ra 


x 

y 

: 5,9, (2 i 

j (hg), 2, 5), (Bs 8). (6 9) 


The parameters of the design are

$$n = 8,\quad v = 7,\quad b = 28,\quad r = 12,\quad k = 3,\quad \lambda = 4,\quad \alpha = 3.$$

Each judge compares 12 of the 28 possible pairs. Other designs obtained in this way are given in Table 2. It should be noticed that the design (1) of Table 2 has already been obtained in a different manner.

Next, let the number of objects $n$ be odd, say $n = 2t+1$. Then we can divide the $t(2t+1)$ pairs into $t$ sets of $2t+1$ pairs each, such that each object occurs exactly twice among the pairs of a set. For example, if $n = 7$ we can take the objects to be 0, 1, 2, 3, 4, 5, 6. Then the three sets are:

I       (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 0)
II      (0, 2), (1, 3), (2, 4), (3, 5), (4, 6), (5, 0), (6, 1)        (5·6)
III     (0, 3), (1, 4), (2, 5), (3, 6), (4, 0), (5, 1), (6, 2)




In the general case we can take the objects to be $0, 1, \ldots, 2t$. Then the $i$th set consists of all pairs for which the difference mod $(2t+1)$ between the objects constituting the pair is $i$.
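
The rule for odd $n$ is short enough to sketch directly. The function name `difference_sets` is an illustrative assumption, not notation from the paper. For $t = 3$ the sketch prints the three sets for $n = 7$ together with the number of times each object occurs in a set.

```python
def difference_sets(t):
    """Set i (i = 1..t) holds all pairs whose objects differ by i mod (2t + 1)."""
    n = 2 * t + 1
    return {i: [(x, (x + i) % n) for x in range(n)] for i in range(1, t + 1)}


sets = difference_sets(3)
for i, pairs in sets.items():
    # Count how often each object 0..6 appears among the pairs of set i.
    counts = [sum(obj in p for p in pairs) for obj in range(7)]
    print(i, pairs, counts)
```

Every count is 2, and the three sets together exhaust the $t(2t+1) = 21$ distinct pairs.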

Let us now take a balanced incomplete block design with $v'$ treatments, $b' = t$ blocks, $r'$ replications, block size $k'$ and in which every pair of treatments occurs together in the same block $\lambda'$ times, and make each block correspond to one set and each treatment to a judge. We then get a linked paired comparison design by assigning to each judge the sets of pairs corresponding to all blocks in which the treatment corresponding to the judge occurs. We obtain in this way a linked paired design with parameters

$$n = 2t+1,\quad v = v',\quad b = t(2t+1),\quad r = (2t+1)r',\quad k = k',\quad \lambda = (2t+1)\lambda',\quad \alpha = 2r'. \qquad (5·7)$$

For example, in the case $n = 7$, we get the linked paired comparison design no. (5) of Table 2. It should be noted that the sets obtained in this case are the same as the tours round the preference polygon considered by Kendall (1955), and lead to the designs already considered by him.


Table 2

Design                                                                    Sets of pairs
No.    Parameters        Sets of pairs                          Judge     assigned to a judge

(1)    n = 4, v = 3      I    (1, 2), (0, ∞)                    a         II, III
       b = 6, r = 4      II   (2, 0), (1, ∞)                    b         III, I
       k = 2, λ = 2      III  (0, 1), (2, ∞)                    c         I, II
       α = 2

(2)    n = 6, v = 5      I    (1, 4), (2, 3), (0, ∞)            a         II, III, IV, V
       b = 15, r = 12    II   (2, 0), (3, 4), (1, ∞)            b         I, III, IV, V
       k = 4, λ = 9      III  (3, 1), (4, 0), (2, ∞)            c         I, II, IV, V
       α = 4             IV   (4, 2), (0, 1), (3, ∞)            d         I, II, III, V
                         V    (0, 3), (1, 2), (4, ∞)            e         I, II, III, IV

(3)    n = 8, v = 7      The sets (5·1)                         The design (5·5)
       b = 28, r = 12
       k = 3, λ = 4
       α = 3

(4)    n = 8, v = 7      The sets (5·1)                         a         III, V, VI, VII
       b = 28, r = 16                                           b         IV, VI, VII, I
       k = 4, λ = 8                                             c         V, VII, I, II
       α = 4                                                    d         VI, I, II, III
                                                                e         VII, II, III, IV
                                                                f         I, III, IV, V
                                                                g         II, IV, V, VI

(5)    n = 7, v = 3      The sets (5·6)                         a         II, III
       b = 21, r = 14                                           b         III, I
       k = 2, λ = 7                                             c         I, II
       α = 4



REFERENCES

Bose, R. C. (1939). On the construction of balanced incomplete block designs. Ann. Eugen., Lond., 9, 353-99.
Bose, R. C. (1949). A note on Fisher's inequality for balanced incomplete block designs. Ann. Math. Statist. 20, 619-20.
Carmichael, R. D. (1937). Introduction to the Theory of Groups of Finite Order. Boston and London: Ginn and Co.
Chowla, S. & Ryser, H. J. (1950). Combinatorial problems. Canad. J. Math. 2, 93-9.
Connor, W. S. (1952). On the structure of balanced incomplete block designs. Ann. Math. Statist. 23, 57-71.
Connor, W. S. & Hall, M. (1954). An embedding theorem for balanced incomplete block designs. Canad. J. Math. 6, 35-41.
Fisher, R. A. (1940). An examination of the different possible solutions of a problem in incomplete blocks. Ann. Eugen., Lond., 10, 52-75.
Fisher, R. A. & Yates, F. Statistical Tables for Biological, Agricultural and Medical Research. London and Edinburgh: Oliver and Boyd.
Kendall, M. G. (1955). Further contributions to the theory of paired comparisons. Biometrics, 11, 43-62.
Shrikhande, S. S. (1950). The impossibility of certain symmetrical balanced incomplete block designs. Ann. Math. Statist. 21, 106-11.





ON THE DISTRIBUTION OF THE LARGEST OR THE SMALLEST 
ROOT OF A MATRIX IN MULTIVARIATE ANALYSIS* 


By K. C. S. PILLAI†
University of North Carolina and University of Travancore


1. INTRODUCTION 


This paper deals with the distribution problem of the extreme characteristic roots of a matrix in multivariate analysis (Roy, 1939; Hsu, 1939; Fisher, 1939). Roy (1945, 1953, 1954) has discussed in detail the usefulness of the extreme characteristic roots in tests of hypotheses and confidence interval estimation in multivariate situations.

For obtaining tests of three different types of hypotheses in multivariate analysis, namely, (i) that of equality of the dispersion matrices of two $p$-variate normal populations, (ii) that of equality of the $p$-dimensional mean vectors for $l$ $p$-variate normal populations and (iii) that of independence between a $p$-set and a $q$-set ($p \le q$) of variates in a $(p+q)$-variate normal population, in each case we arrive at the characteristic roots of a matrix based on sample observations. In each case, if the hypothesis to be tested is true, the non-zero roots $(0 < \theta_1 < \theta_2 < \ldots < \theta_s < 1;\ s \le p)$ have the same joint distribution, the form of which was given by Roy (1939), Hsu (1939) and Fisher (1939). The distribution can be written in the form

$$p(\theta_1, \ldots, \theta_s) = C(s, m, n) \prod_{i=1}^{s} \theta_i^{m} (1-\theta_i)^{n} \prod_{i>j} (\theta_i - \theta_j) \qquad (0 < \theta_1 < \ldots < \theta_s < 1), \qquad (1)$$

where

$$C(s, m, n) = \pi^{s/2} \prod_{i=1}^{s} \frac{\Gamma\{\tfrac12(2m+2n+s+i+2)\}}{\Gamma\{\tfrac12(2m+i+1)\}\,\Gamma\{\tfrac12(2n+i+1)\}\,\Gamma(\tfrac12 i)}, \qquad (2)$$

and where $m$ and $n$ have to be interpreted differently for the different situations (Pillai, 1954, 1955).


The problem of obtaining the cumulative distributio 


preted differently for the different situations (Pillai, 


n function (c.d.f.) of the largest root 
Y (1945), who gave explicit expressions for the c.d.f. for 
(1948) also gave such expressions for s — 2, 3, 4 and 5. 
ded these results for the c.d.f. for values of s up to 8- 


e probability integral. However, if one considers only 


1 T percentage points (5% or less), the approximations 
suggested in the following section will be useful. 


2. APPROXIMATE FORMULAE FOR OBTAINING UPPER PERCENTAGE POINTS (5% OR LESS) OF THE LARGEST ROOT

Let us first consider the case of two roots. The c.d.f. of the largest root in this case involves only two incomplete beta functions and can be written in an explicit form (Nanda, 1948; Pillai, 1954) given by

$$\Pr(\theta_2 \le x) = \frac{C(2, m, n)}{m+n+2} \left\{ 2\int_0^x \theta_2^{2m+1}(1-\theta_2)^{2n+1}\,d\theta_2 - x^{m+1}(1-x)^{n+1} \int_0^x \theta_2^{m}(1-\theta_2)^{n}\,d\theta_2 \right\}. \qquad (3)$$

* Part of a doctoral thesis, submitted to the Department of Statistics, University of North Carolina, Chapel Hill; work done under the sponsorship of the Ford Foundation.
† Now with United Nations, New York.


For integral values of $m$, integration by parts of the two integrals in (3) will reduce the probability integral in (3) to the form

$$\Pr(\theta_2 \le x) = \frac{C(2, m, n)}{m+n+2} \left\{ \sum_{i=0}^{m} \frac{(m)_i\,\Gamma(n+1)}{\Gamma(n+i+2)}\, x^{2m-i+1}(1-x)^{2n+i+2} - 2\sum_{i=0}^{2m+1} \frac{(2m+1)_i\,\Gamma(2n+2)}{\Gamma(2n+i+3)}\, x^{2m-i+1}(1-x)^{2n+i+2} + \frac{2\Gamma(2m+2)\Gamma(2n+2)}{\Gamma(2m+2n+4)} - \frac{\Gamma(m+1)\Gamma(n+1)}{\Gamma(m+n+2)}\, x^{m+1}(1-x)^{n+1} \right\}, \qquad (4)$$

where $(t)_r = t(t-1)\ldots(t-r+1)$.

In (4) if we neglect terms of order higher than $x^{m+1}(1-x)^{n+1}$, we get

$$\Pr(\theta_2 \le x) = \frac{C(2, m, n)}{m+n+2} \left\{ \frac{2\Gamma(2m+2)\Gamma(2n+2)}{\Gamma(2m+2n+4)} - \frac{\Gamma(m+1)\Gamma(n+1)}{\Gamma(m+n+2)}\, x^{m+1}(1-x)^{n+1} \right\}. \qquad (5)$$

For small values of $m$, (5) will give the percentage points for the upper 5% level or less. Using the fact that $\Pr(\theta_2 \le 1) = 1$ we obtain (5) in the simpler form

$$\Pr(\theta_2 \le x) = 1 - \frac{\Gamma(m+1)\Gamma(n+1)\Gamma(2m+2n+4)}{2\Gamma(m+n+2)\Gamma(2m+2)\Gamma(2n+2)}\, x^{m+1}(1-x)^{n+1}. \qquad (6)$$

If the probability in (6) has to be 0·95, then

$$\frac{\Gamma(m+1)\Gamma(n+1)\Gamma(2m+2n+4)}{2\Gamma(m+n+2)\Gamma(2m+2)\Gamma(2n+2)}\, x^{m+1}(1-x)^{n+1} = 0·05. \qquad (7)$$

The expression in (7) is very simple for computation, especially when the values of $m$ are small. The error involved in using this approximation has been computed and the difference between the exact and the approximate probabilities has been found to occur in the fifth decimal place, and, on rounding off, there could be at the most a difference of 1 in the fourth place.
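
Equation (7) has no closed-form solution for $x$, but it is easily solved numerically. The sketch below is not the author's computing procedure: it assumes the leading-term form of (6) and (7), uses bisection on the interval to the right of the mode of $x^{m+1}(1-x)^{n+1}$, and the function names are illustrative.

```python
from math import exp, lgamma, log


def approx_upper_tail(x, m, n):
    """Approximate Pr(theta_2 > x) for s = 2: the left-hand side of (7)."""
    log_coeff = (lgamma(m + 1) + lgamma(n + 1) + lgamma(2 * m + 2 * n + 4)
                 - log(2.0) - lgamma(m + n + 2)
                 - lgamma(2 * m + 2) - lgamma(2 * n + 2))
    return exp(log_coeff) * x ** (m + 1) * (1.0 - x) ** (n + 1)


def upper_point(m, n, alpha=0.05, tol=1e-10):
    """Solve approx_upper_tail(x) = alpha for the root above the mode, by bisection."""
    lo = (m + 1) / (m + n + 2)  # mode of x^(m+1) (1-x)^(n+1); the tail decreases beyond it
    hi = 1.0
    if approx_upper_tail(lo, m, n) < alpha:
        raise ValueError("alpha too large for the upper-tail approximation")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if approx_upper_tail(mid, m, n) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for m, n in [(0, 5), (0, 10), (1, 5)]:
    print(m, n, round(upper_point(m, n), 3))
```

Under these assumptions the roots for $(m, n) = (0, 5)$, $(0, 10)$ and $(1, 5)$ can be compared with the tabulated upper 5% points 0·565, 0·374 and 0·651.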
Proceeding on the same lines as above, i.e. as for the approximation in (6), we get the following approximations for $s = 3$, 4 and 5:


Pr (9 es  T(mnt3).- 7 ge(1 — 65)" ds (m+1)( 
6e ia lh em | 
Lam -ay (m + 2n | , (8 


E [f opaa =) 46. (m+1) 


imate P: 
ff, there could 


(6), we get the 


2m+3) 


(2m 4- 2n 8) 


2) na 
om+ 2n 8) Fm 2) F(n- 2) gm+3(1 —&) 


Pr(9, <2 
(0,52) = 1- gris cn 3)1 


T( 2) (2m *- 2n * B) ta ape 
+4p(2m+ 9) T(2n+ 4) (n? +4) 
+2) Tn ore i amt(1— g)", (9) 


9) r(m+ j ro 
E Ip omite 


124 Distribution of largest or smallest root of a matrix 


___T(mtn+4) fionna " : i à : 
Fr(9,52) = gpim FTT 8) 77 * 9 8m 8) | OO 69 db, 
2 5) (2 2 7) (2m? + 2 - 13m 2) (= 
n: m +5) (2m + 2n +7) ( m? + 2mn 13m Y 1 ' or — 0, dO 
(2n. 4- 3) 0 
2 21 2 7) (2 2n + px 
(2m+3)( m+: ers aereo 02+2(1 — 0, d0; 
(2n +3) 0 


(m+ 3) (2m +5) (2m + 2n 4- 5) (2m - 2n 4- 7) 
(2n+3)(m+1) 
4.3 9m 2n 7) (2m 2n 4- 9) 
(m 4- 1) 


2m 2n 1) (2m 4 2 ( 
(2m t 2n+7) (2m - 2n 4- 9) (nn 4) uan iy . (10) 


gni gjet 


gray] jen 


(m 4-1) (m 4- 2) 
The incomplete beta functions in (8) and (10) can be further integrated by parts, and for 


small values of m the expressions will become simpler and could be readily used for 
computation. 


3. PERCENTAGE POINTS FOR THE LARGEST ROOT 
An important problem in multivariate analysis is to test the hypothesis that $l$ $p$-variate normal populations having the same variance-covariance matrix, from each of which a sample is drawn, have the same mean for each variate. This, in fact, is the second one among the three tests discussed in the introduction of the paper. For this test it turns out that


l—p-1|-1 v—p—l— 
m li-2-1|-1 and TE n (11) 


For $p = 2$ we have $m = \tfrac12(l-4)$, and the expressions for the upper percentage points given in §2 which are good for small values of $m$ become quite useful, because the number of samples, $l$, cannot be too large. For example, the case of $m = 4$, i.e. $l = 12$, relates to a problem with 12 samples. Using (7) tables of upper 5 and 1% points have been computed for two roots for values of $m = 0$, 1, 2, 3 and 4 and $n = 5$, 10, 15, 20, 25, 30, 40, 60, 80, 100, 130, 160, 200, 300, 500 and 1000. Significance levels for fractional values of $m$ and intermediate values of $n$ can be obtained by interpolation. The percentage points based on the approximate formula (7) are given in Table 1.

In order to form an idea of the error of approximation, the probability based on the exact formulae is calculated at the percentage points for the parameters $m = 2$ and $n = 10$, 30 and 80 and $m = 4$, $n = 5$ and 100, picked out from Table 1 (based on the approximate formula (7)). The difference between the exact and approximate probabilities is exhibited in Table 2. It is to be noted that within the range of values of the parameters considered, the approximation is quite good.
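
A comparison of this kind can be sketched by evaluating the exact form (3) numerically. The code below is an illustration only: it uses Simpson's rule in place of tables of the incomplete beta function, assumes the leading-term form of (6), and all function names are assumptions. The parameters $m = 0$, $n = 5$ are chosen because the corresponding 5% point 0·565 is legible in Table 1.

```python
from math import exp, lgamma, log


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def exact_cdf(x, m, n):
    """Pr(theta_2 <= x) for s = 2 from (3), for integer m and n."""
    log_c = lgamma(2 * m + 2 * n + 5) - log(4.0) - lgamma(2 * m + 2) - lgamma(2 * n + 2)
    first = simpson(lambda t: t ** (2 * m + 1) * (1 - t) ** (2 * n + 1), 0.0, x)
    second = simpson(lambda t: t ** m * (1 - t) ** n, 0.0, x)
    bracket = 2.0 * first - x ** (m + 1) * (1 - x) ** (n + 1) * second
    return exp(log_c) * bracket / (m + n + 2)


def approx_cdf(x, m, n):
    """Leading-term approximation (6) to Pr(theta_2 <= x)."""
    log_k = (lgamma(m + 1) + lgamma(n + 1) + lgamma(2 * m + 2 * n + 4)
             - log(2.0) - lgamma(m + n + 2) - lgamma(2 * m + 2) - lgamma(2 * n + 2))
    return 1.0 - exp(log_k) * x ** (m + 1) * (1 - x) ** (n + 1)


x = 0.565  # tabulated upper 5% point for m = 0, n = 5
print(approx_cdf(x, 0, 5) - exact_cdf(x, 0, 5))
```

In this sketch the difference (approximate minus exact) is positive and of the order of $10^{-5}$, the same order as the entries of Table 2.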

Nanda (1951) has given significance levels (upper 5 and 1%) for the largest root for $s = 2$ and very small values of $m$ and $n$ ($m = 0(\tfrac12)2$; $n = \tfrac12(\tfrac12)10$). His table has been computed by using the Tables of the Incomplete Beta Function (1934) for some values of $m$ and $n$, the other values being obtained by interpolation. The significance levels are given only to two decimals, and the range of $m$ and $n$ considered is very small. In Table 1, however, the range for $n$ is from 5 to 1000 and for $m$ is 0(1)4. The significance levels have been given correct to


Table 1. Percentage points of the largest root for s = 2

(a) Upper 5% points

n      m = 0      1      2      3      4
> 0-565 0-651 | 0706 | 0-746 0-776 
10 0-374 jas | QM Coe 0-598 
15 0-278 | wss | m | D 0-483 
20 0222 | 0381 93399 | 0369 0-403 
25 0-183 ox | ies | DER 0-346 
30 0-157 0-203 0-241 0-274 0-303 
29 0-121 | 0-158 0-190 0-218 0-243 
60 0-0836 0-110 0-133 0-154 0-173 
80 0-0638 0-0846 o107; | 019 0-1345 
100 0-0515 0-0686 00835 | 0-0972 0-1100 
130 0-0400 0:0535 0.00522 | 00761 0.0864 
" | 7 
180 0-0327 0-0437 0-0535 0-0626 oor | 
200 0-0263 0:0352 0:0432 | 0-0506 0:0576 
300 | " | 231 0-0291 0:0342 0:0390 
= | 0-0176 | 0-0237 X 
500 coins | 0013 0-0176 0-0207 0-0237 
1000 | 000535 0-00719 0-00888 0-01045 0-01195 

(b) Upper 1% points

n      m = 0      1      2      3      4
5 745 0-787 0-817 0-839 
«6 0-745 , T 
10 m | 0.544 0-597 0-038 0 x 
15 OT | 0-425 0-476 ooi pr 
an nam | E 0-394 0-433 n 
25 ne | 9.293 0-336 0-372 mus 
30 207 | 03254 0.293 0-326 s 
e 4 
| 2 
, 0-261 0-286 
40 y 00 — | 0-232 | f 
60 $13, | ono oas | OB, |o pue 
2 0-0852 0-1080 01273 gns 0-1319 
o oosoz | 00878 0:1038 0.0930 0-1039 
Y8o 0-0539 0.0085 | 0-0813 
16 | 0668 0-0765 0-0857 
D 00441 | 00562 s 0-0619 0-0694 
200 oosss | 00459 pee 0-0419 0-0471 
300 bet | 00305 0-0365 ous 0:0287 
500 — | 00144 9.0188 | (0221 Soas 0-01448 
1000 0.00725 0.00930 | 001114 eco 
i 0 




three significant figures, in order to ensure sufficient accuracy for interpolation. For testing 
the equality of variate means from different populations, Table 1 should be enough up to 
m = 4, but if anything beyond this is needed one can obtain the significance level easily from 
the expression in (6) for small values of m. 


Table 2. The error of approximation at the 5% level (i.e. difference in value between (3) and (6) for s = 2 at the 5% significance level in Table 1)

        m = 2                              m = 4
n       difference                 n       difference
        (approximate - exact)              (approximate - exact)

10      0·00002693                 5       0·00002574
30      0·00002697                 100     0·00003477
80      0·00002757


4. THE DISTRIBUTIONS OF THE LARGEST AND SMALLEST ROOTS FOR s = 2 AS HYPERGEOMETRIC SERIES

The joint probability density function of $\theta_1$ and $\theta_2$ is given by

$$p(\theta_1, \theta_2) = C(2, m, n)\,(\theta_1\theta_2)^{m}\,\{(1-\theta_1)(1-\theta_2)\}^{n}\,(\theta_2 - \theta_1) \qquad (0 < \theta_1 < \theta_2 < 1), \qquad (12)$$

where

$$C(2, m, n) = \Gamma(2m+2n+5)/\{4\Gamma(2m+2)\Gamma(2n+2)\}.$$

Now consider the transformation

$$\lambda\theta_2 = \theta_1, \qquad \theta_2 = \theta_2. \qquad (13)$$

Then

$$p(\lambda, \theta_2) = C(2, m, n)\,\theta_2^{2m+2}(1-\theta_2)^{n}(1-\lambda\theta_2)^{n}\lambda^{m}(1-\lambda), \qquad (14)$$

where the range of variation of $\lambda$ as well as $\theta_2$ is 0-1. Now integrating out $\lambda$ from $p(\lambda, \theta_2)$,

$$p(\theta_2) = C(2, m, n)\,\theta_2^{2m+2}(1-\theta_2)^{n} \int_0^1 (1-\lambda\theta_2)^{n}\lambda^{m}(1-\lambda)\,d\lambda, \qquad (15)$$

or

$$p(\theta_2) = \frac{C(2, m, n)}{(m+1)(m+2)}\,\theta_2^{2m+2}(1-\theta_2)^{n}\,F(m+1, -n, m+3, \theta_2), \qquad (16)$$

where

$$F(m+1, -n, m+3, \theta_2) = 1 - \frac{n(m+1)}{m+3}\,\theta_2 + \frac{n(n-1)(m+1)(m+2)}{2!\,(m+3)(m+4)}\,\theta_2^{2} - \ldots. \qquad (17)$$

Again starting with (12) and effecting the following transformation

$$\mu(1-\theta_1) = 1-\theta_2 \quad\text{and}\quad \theta_1 = \theta_1, \qquad (18)$$

we get

$$p(\theta_1) = C(2, m, n)\,\theta_1^{m}(1-\theta_1)^{2n+2} \int_0^1 \mu^{n}(1-\mu)\{1-(1-\theta_1)\mu\}^{m}\,d\mu, \qquad (19)$$

or

$$p(\theta_1) = \frac{C(2, m, n)}{(n+1)(n+2)}\,\theta_1^{m}(1-\theta_1)^{2n+2}\,F(n+1, -m, n+3, 1-\theta_1). \qquad (20)$$

It may be observed that a change from $1-\theta_1$ to $\theta_2$, and $m$ to $n$ in (20) gives (16). Hence

$$\Pr(\theta_1 \le x) = \Pr(\theta_1 \le x;\ m, n) = \Pr(1-\theta_2 \le x;\ n, m) = \Pr(1-x \le \theta_2;\ n, m) = 1 - \Pr(\theta_2 \le 1-x;\ n, m). \qquad (21)$$

Hence lower percentage points for the smallest root can be obtained from the upper percentage points of the largest root when the parameters $m$ and $n$ are interchanged.

The result (21) is true for any value of $s$ as Nanda (1948) has shown, starting from distribution (1). Hence the expressions for the c.d.f. of the largest root given in §2 can be used to obtain the lower percentage points (5% or less) of the smallest root by noting the relation (21) and substituting $s$ for 2.
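
As a worked instance of relation (21), taking as given the Table 1 entry 0·651 (upper 5% point of the largest root for $m = 1$, $n = 5$), the lower 5% point of the smallest root for the interchanged parameters $m = 5$, $n = 1$ follows directly. The choice of parameters is illustrative only.

```latex
\Pr(\theta_1 \le x;\ m = 5,\ n = 1)
  = 1 - \Pr(\theta_2 \le 1 - x;\ m = 1,\ n = 5) = 0.05
\quad\Longrightarrow\quad
1 - x = 0.651, \qquad x = 0.349 .
```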
5. SUMMARY 


An approximation to the cumulative distribution function of the largest (smallest) root of a matrix in multivariate analysis for obtaining the upper (5% or less) percentage points for the largest root (lower (5% or less) percentage points of the smallest) is given for any number of roots up to 5. Percentage points for two roots for small integral values of $m$ have also been presented. The distributions of the largest and smallest roots are also developed as hypergeometric series in the case of $s = 2$.

I wish to acknowledge my indebtedness to Prof. S. N. Roy for his kind advice and 
M - sations. I am also indebted to Prof. H. Hotelling for 


Criticism ; 
irum in the course of my invests - EIS 
S active interest and valuable i the preparation © his paper. 


REFERENCES

Fisher, R. A. (1939). Ann. Eugen., Lond., 9, 238.
Hsu, P. L. (1939). Ann. Eugen., Lond., 9, 250.
Nanda, D. N. (1948). Ann. Math. Statist. 19, 340.
Nanda, D. N. (1951). J. Ind. Soc. Agric. Statist. 3, 179.
Pearson, K. (1934). Tables of the Incomplete Beta Function. Cambridge University Press for the Biometrika Trustees.
Pillai, K. C. S. (1954). On Some Distribution Problems in Multivariate Analysis, Mimeo Series no. 88, Institute of Statistics, University of North Carolina.
Pillai, K. C. S. (1955). Ann. Math. Statist. 26, 117.
Roy, S. N. (1939). Sankhyā, 4, 381.
Roy, S. N. (1945). Sankhyā, 7, 133.
Roy, S. N. (1953). Ann. Math. Statist. 24, 220.
Roy, S. N. (1954). A Report on Some Aspects of Multivariate Analysis, Mimeo Series no. 121, Institute of Statistics, University of North Carolina.



TESTS OF SIGNIFICANCE FOR THE LATENT ROOTS OF 
COVARIANCE AND CORRELATION MATRICES 


By D. N. LAWLEY
Mathematical Institute, Edinburgh


2. RESIDUAL LATENT ROOTS OF A COVARIANCE MATRIX WHEN THE TRUE VALUE $\lambda$ IS KNOWN
Suppose that x, V5, ...,€, are p variates following a multiva 
true covariance matrix A, = [a]. Assuming that the first k 
in order of magnitude, are distinct, we make the hypothesis th 


riate normal distribution with 


es of freedom (corresponding 2$ 
a rule toa sample of size n + 1). Let; (i = 1,2,.. -; P) be the latent roots in order of magnitude 
of A. It is assumed that n is sufficiently large for us to be able to set up an almost certain 
correspondence between the first k of the l; and the first % of the Àj. More precisely, We 
require that sampling errors, which are of order 1//n, should be small compared with the 


differences betw, -Àp A. The degree of uncertainty i! 


t appreciably the approximations developed in the paper. 


use as an approximate $\chi^2$, with $\tfrac12(p-k)(p-k+1)$ degrees of freedom, the expression

$$-\log_e (l_{k+1} l_{k+2} \ldots l_p/\lambda^{p-k}) + (l_{k+1} + l_{k+2} + \ldots + l_p)/\lambda - (p-k), \qquad (1)$$
(It is easy to verify tha 


o = M in his expression of III Ly 
r will be determined by finding the expectation of expression (1) 


To simplify the pro 
tions are zero, i.e. that A 


A 0 A A 
A = ms 11 12 
[o a]. 4 [an an} 
where A is the diagonal matrix with elements Are Aye 


ep a 


We shall use the symbol $\delta$ to denote deviations of sample values from expectation so that, for example, $\delta A = A - A_0$, $\delta a_{ij} = a_{ij} - \alpha_{ij}$, $\delta l_i = l_i - \lambda_i$. All such quantities are of order $1/\sqrt{n}$, and in what follows we shall be able to neglect terms of sufficiently high degree in the $\delta$'s. Partitioning $\delta A$ in the same manner as $A$, we may write

$$A_{11} = \Lambda + \delta A_{11}, \qquad A_{22} = \lambda I + \delta A_{22}, \qquad A_{12} = \delta A_{12}.$$

If we are prepared to neglect the squares of the elements of $\delta A_{12}$, the residual latent roots of $A$ could be taken as being approximately equal to the latent roots of $A_{22}$. This would not, however, give us sufficient accuracy, since we wish to evaluate the expectation of expression (1) as far as terms in $1/n^2$. To accomplish this we define a matrix $B$ having the same latent

Toots as A and given by B= (+B) AG +B); 
Where ea | Ey rx , 
T 
E = (A-M) ôA 
0 [t DS 
Ay-Az A7 Ax 
Jay. bs, 
> 0 
By = | Ñ= Aa As 
bas x. 00 


yank Ans 
We now write B= (E) (4s 94) LEY 
ALL (E4, Ao 94) 4 HOA} - EE h 
E.A-AEQ49An 9 | 
‘nd we note that BA,— AE +54 = | 11 ^ 11 RI 


„ Hence 
al matrix with elements 6441; NUT: 1 


: tao; ++ 
pan Ey, A-A+ ôA 188 diagon p 
ttitioning B in the usual manner, We have 
By tal 
2-[n Bal 
al elements of By 
dual roots l.i ligare 


7 are all of order à?. From 
des mue say lp are equal to the 


Where of Bio Bor% es 
à the elements 1» D$ an ài th i 
her 


ee 
a Sit follows that with errors of order 
ent roots of 


AI? sA 
Ba = Ass — sAn (A — A) A ) 1 


apAnl(A -A Aa tO). (9) 
.., k) is equal to the rth 


30445— 0A os 8Aa( 

I 4 0Ag(&- MD) 
: ges dod 
? also follows that, to the same degree of accuracy; L(r- i 


la, 
Sonal element of Bx, which is 


E Sap; 00,5005. | i 3 
Oa; e iu]. 9050001 +0(54), (3) 
» (2) +e (CREE STORY 
r i 


is to be OV ]ues from 1 to p except r. 
n is to uu 


y 
A+ 0a, + Y [ess — a, 


i 
i er all va 


here Sy > : . 
, indicates that the summatio 


130 Latent roots of covariance and correlation matrices 


It is easy to see that errors of order $\delta^4$ in the latent roots $l_{k+1}, l_{k+2}, \ldots, l_p$ will give rise to an error only of order $\delta^5$ in expression (1), so that in it we may replace the product of the latent roots by the determinant of $B_{22}$ and the sum by the trace of $B_{22}$. In the ensuing algebra it will be convenient to put $\lambda = 1$ and to replace $\lambda_r$ by $\lambda_r/\lambda$ at the conclusion. The expression (1) then becomes

$$-\log_e |B_{22}| + \operatorname{tr}(B_{22}) - q,$$

where $q = p - k$.
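
As a small numerical illustration of expression (1), the sketch below evaluates it from a set of residual latent roots and a known common value $\lambda$. The function names are assumptions, and the multiplying factor derived in this section is not reproduced: only the unscaled criterion and its degrees of freedom are computed.

```python
from math import log


def known_lambda_criterion(residual_roots, lam):
    """Expression (1): -log(prod l_i / lam^q) + sum(l_i) / lam - q, with q = p - k."""
    q = len(residual_roots)
    return -sum(log(l / lam) for l in residual_roots) + sum(residual_roots) / lam - q


def degrees_of_freedom(q):
    """Degrees of freedom (1/2) q (q + 1) when lambda is known."""
    return q * (q + 1) // 2


print(known_lambda_criterion([2.0, 2.0, 2.0], 2.0))  # roots equal to lambda
print(known_lambda_criterion([1.5, 1.0, 0.5], 1.0))  # roots scattered about lambda
print(degrees_of_freedom(3))
```

The criterion vanishes when every residual root equals $\lambda$ and is positive otherwise, since $-\log u + u - 1 \ge 0$ for each ratio $u = l_i/\lambda$.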

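Now, writing $B_{22}$ as $A_{22} - Z$, where $Z$ is given by (2), we have

$$|B_{22}| = |A_{22} - Z| = |A_{22}|\,|I - A_{22}^{-1}Z| = |A_{22}|\,|I - \{I - \delta A_{22} + (\delta A_{22})^2 - \ldots\}Z|.$$

From this it follows that

$$\log_e |B_{22}| = \log_e |A_{22}| - \operatorname{tr}(Z) + \operatorname{tr}\{(\delta A_{22})Z\} - \operatorname{tr}\{(\delta A_{22})^2 Z\} - \tfrac12\operatorname{tr}(Z^2) + O(\delta^5).$$

We also have

$$\operatorname{tr}(B_{22}) = \operatorname{tr}(A_{22}) - \operatorname{tr}(Z).$$

Hence the expression (1) may finally be written as

$$[-\log_e |A_{22}| + \operatorname{tr}(A_{22}) - q] - \operatorname{tr}\{(\delta A_{22})Z\} + \operatorname{tr}\{(\delta A_{22})^2 Z\} + \tfrac12\operatorname{tr}(Z^2) + O(\delta^5). \qquad (4)$$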


The part of (4) within the square brackets is merely what would arise if we were testing the equality of all the latent roots of the covariance matrix $A_{22}$, of order $q$. Using Bartlett's result (IIIa) we can evaluate its expectation as


IS I 2 1 
b Dens l+- +o). 


To find the expectations of the remaining terms of (4) 


, and of other quantities occurring 
later, we need the results exemplified as follows, where q 


uantities of order 1/n? are neglected: 


E((0a,,)9) = 8Aj/n?, E(Ba,(0u?) = 2AZAg/n?, 
E(90,,90,,94,). = A As As[n?, E((02,,) = 122dn?, 

E((0a,,* (255)? = 4ATAs[m?, E((8a,,)* (9a,5)*) = 241 As[m?, 
HX (8443)? (Jas). = 232A, An, E((0,,)5) = BATAg/n®, 


E{(8ay9)? (92,5) = AIAsAs[n?, E((8a,5)* (9a5,)?) = 944323 Aan. 

Terms of other kinds which arise al 

By making use of the above res 
order 1/n? we find that 


+(® A E 
Fi (84,) 25 = BE +S op 
= qd(q 1) %,/n2, 
Eir (94,3? Z)] = q(q- 1) Zi fne, 
Hite (Z*)] = tr (34s, (A — 1) 4,7] 


l have expectations of order 1[n? or less. f 
ults, substituting for Z from (2) and neglecting terms ° 


= {2i +qlq + 1) j/m?, 
N A, wifes id 
where 2 -E (= jJ 4 2, M (= 1 : 




Hence the expectation of expression (4) is 


J t otl 3 ) 
lq(q4-1 i NEP RECS EE 
sa "s gt id q+ ji “Dne Dit (q+ VEZ} 


1 1 2 1 k il 52 
= la+) |- +(+] -È L7 
dn JE iu ratal = 3l 


ect multiplying factor for (1) is 


Thi 
his means that, for general A, the corr 
l 2 i! kA B n 1 
regain dle oor 
il gtl- ry) Ge Dla WA ear Wee 
ated values I, would usually have to 
are all large compared with A then 


I IPorsati ; 
ia ai actical use were to be made of this result, estim 
ubstituted for the true latent roots. Tf Ay, Ags Ax 
‘Se A Ve : a 
(A,— 23 is small, = x r x) 5 approximately equal to k and the multi 

b u r ai 
ecomes approximately 


plying factor 


3. THE CASE WHERE $\lambda$ IS UNKNOWN

We consider the case where the hypothesis is still that the smallest $p-k$ latent roots of $A_0$ are equal but where the common value $\lambda$ is unknown. The approximate $\chi^2$ used for testing the hypothesis now has $\tfrac12(p-k-1)(p-k+2)$ degrees of freedom, one less than before, and may be expressed as

$$-\log_e (l_{k+1} l_{k+2} \ldots l_p) + (p-k)\log_e \{(l_{k+1} + l_{k+2} + \ldots + l_p)/(p-k)\}, \qquad (5)$$

multiplied by a factor, which when $k = 0$ has been shown by Bartlett (his case IIIc) to be

$$n - \frac{1}{6}\left(2p + 1 + \frac{2}{p}\right).$$

With the same notation as before and to the same order of approximation (5) may be replaced by

$$-\log_e |B_{22}| + q\log_e \left\{\frac{1}{q}\operatorname{tr}(B_{22})\right\}.$$
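
The criterion (5) and Bartlett's factor for $k = 0$ can be sketched as follows. The function names are illustrative assumptions, and the refinement of the factor for general $k$ obtained later in this section is deliberately left out.

```python
from math import log


def unknown_lambda_criterion(residual_roots):
    """Expression (5): -log(prod l_i) + q log(mean of l_i), with q = p - k."""
    q = len(residual_roots)
    mean = sum(residual_roots) / q
    return -sum(log(l) for l in residual_roots) + q * log(mean)


def degrees_of_freedom(q):
    """Degrees of freedom (1/2)(q - 1)(q + 2) when the common value is unknown."""
    return (q - 1) * (q + 2) // 2


def bartlett_factor(n, p):
    """Bartlett's multiplying factor for k = 0: n - (2p + 1 + 2/p) / 6."""
    return n - (2 * p + 1 + 2 / p) / 6


roots = [1.5, 1.0, 0.5]  # hypothetical sample roots with k = 0, p = 3
chi2 = bartlett_factor(100, 3) * unknown_lambda_criterion(roots)
print(chi2, degrees_of_freedom(3))
```

The criterion is the logarithm of the ratio of the arithmetic to the geometric mean of the residual roots, multiplied by $q$, so it is zero when the roots are all equal, whatever their common value.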


" 
Ince, putting A = 1, 


i 1 
glog, t tr Ba) —qlog, F tr As] 


biens] 


aU SA 


— log, 


jt Z)- go (PZ 009) 


-— P ; (949 (tr Z)— 


wo)? Zh + hr (A) 


n o 
© expression (5) may be written 25 
4+[- (9A 22) Zt ur (94 
-qtr ay 4009. — (9) 


| tos, | Ass | - loge F tr wi 
Ll (tr dag)? (602) ag 


pe 84,) (6020) 7 E 
9-2 




The terms within the first pair of square brackets are those which would arise if we were testing the equality of all $q$ latent roots of $A_{22}$, the common value being unknown. Their expectation is therefore

Ha- Daso (2114 2) i of). 
2d EX t ess pe q/] Wn 


: d 
The expectation of the terms in the second pair of square brackets has already been foun 
to be 


q A " 
ay CA (q+ 1) (22, — X3, 


where X, and X, are as defined in the preceding section. 
We now find that 


D t k 
Mieta A +B ac) 
= 2gX,[n?, 
E[(tr &4,,)? (tr Z)] = 2g3,/n2, 
E(t Zy] = Eftr ôA (A— 1) 44,49] 
= (q?33 + 293) m?. 


Hence the expectation of the terms within the third pair of squ 
(Z_— 22, — $qB2)/n2, 


and the expectation of the whole expression (6) is 


are brackets in (6) is 


1 1 2 
Ha Ma) eg ore1e2 ag - 1) q--2) Q9, — X)Jn* 


te Den rs one +2) te E J. 


-X4—u 
Ng r=1 (A,- 1)? 


Thus, for general A, the correct multiplying factor for (5) is 


M LANDE AES a 

nt T ee 
mates would have to be substituted for the Jv 
mated as (1, , -- Lis +.. +1,)/q. TEA Ag AE 
uld be omitted and the approximate value 
rk =0 by substituting n—k for n and p— 


e to show that in each case the variance of y? is correctly 
'oximation; but the algebra is rather laborious and is for thi M 


A represents that part of the variance of each va; 


"I1 




ther i 
ail LE of the hypothesis that the residual latent roots are equal to A implies that 
om = ual variance is the result of error. If, on the other hand, A is unknown, all that we 
; . 4 E A ? = 
nfer from the equality of the residual roots is that the residual variance is the result of 


specific fi istingui 
‘actors and error; we cannot distinguish between these two sources of variation 


4. PARAMETERS OF THE DISTRIBUTION OF THE FIRST $k$ LATENT ROOTS

It is perhaps of some interest to consider briefly the expression (3) for the $r$th latent root $l_r$ $(r = 1, 2, \ldots, k)$ of $A$. Taking the expectation we have


A.2,( _Ai 1 


n i=l 


where Y’ is a . > T A 
as previously defined. Similarly for the variance of l, we have 


ah 1e (sel eo( 
n P Mint AA pro Z) 


and : : 
the third and fourth cumulants of I, are given by 


Tt : 
may also be noted that the covariance between I, and l, (r+ s) is 


TE E Luft 
XX) «ois 


a bias only of order 1/n?, is provided by 


A better estimate of A, than l» having 


A T A j- À | 
Rada L-u n I-A)’ 
take the value r and where an estimate is substituted 


Where ; 
re H " 

in the summation ? does not 
his is found to be 


or o 
Aif necessary. The variance of t 
2x, += EDC] 
n Ti Ag Ài n? 


on of A is to that of a variance estimate with 


This gi 
s gives an idea of how near the distributi 
&rees of freedom. 


5. RESIDUAL LATENT ROOTS OF A CORRELATION MATRIX

We consider, finally, the more difficult problem concerning the latent roots of a correlation matrix. To simplify matters we may, without loss of generality, assume that the true variances of the $p$ variates $x_i$ are all unity, so that the true correlation matrix is equal to the covariance matrix $A_0$ (not now diagonal). Assuming that the first $k$ of the latent roots $\lambda_i$ of this matrix are distinct, we wish to test the hypothesis that the residual $p-k$ latent roots are equal, the common value $\lambda$ being, however, unknown.



Let $R$ be the sample correlation matrix and let $l_i$ $(i = 1, 2, \ldots, p)$ represent the latent roots, in order of magnitude, of $R$. Then the criterion used for testing the above hypothesis (corresponding to Bartlett's case IIIc) may be written as

$$n[-\log_e (l_{k+1} l_{k+2} \ldots l_p) + (p-k)\log_e \{(l_{k+1} + l_{k+2} + \ldots + l_p)/(p-k)\}], \qquad (7)$$

where we have taken the multiplying factor simply as $n$, since this is sufficiently accurate for our purpose.

The difficulty here is that the criterion does not, even asymptotically, follow a $\chi^2$ distribution, though it will approximately do so if $\lambda_1, \lambda_2, \ldots, \lambda_k$ are large and $\lambda$ small. Even then, as Bartlett has remarked, the effective number of degrees of freedom depends on the amount of variance removed from each variate by the first $k$ components. Our object is to determine the effective number of degrees of freedom in the general case, in other words to find the expectation of (7). We shall content ourselves with a lower order of approximation than before and shall neglect terms of order $1/n$ or less in this expectation.


Let $V$ and $\delta V$ be the diagonal matrices whose $i$th elements are respectively $a_{ii}$ and $\delta a_{ii}$. Then

$$V = I + \delta V, \quad\text{and}\quad R = V^{-\frac12} A V^{-\frac12}.$$

Let $Q = [Q_1 \vdots Q_2]$ be an orthogonal matrix whose first $k$ columns, $Q_1$, represent the latent vectors corresponding, in correct order, to the latent roots $\lambda_1, \lambda_2, \ldots, \lambda_k$ of $A_0$. This means that

$$Q'A_0Q = \begin{bmatrix} \Lambda & 0 \\ 0 & \lambda I \end{bmatrix},$$

where $\Lambda$ is as previously defined.

Now the latent roots of $R$ are the same as those of $AV^{-1}$, or of

$$Q'AV^{-1}Q = Q'(A_0 + \delta A)(I + \delta V)^{-1}Q = Q'A_0Q + Q'\delta A\,Q - Q'A_0\,\delta V\,Q + O(\delta^2),$$

a matrix whose non-diagonal elements are all of order $\delta$. Hence if we neglect the squares of these, the residual latent roots $l_{k+1}, l_{k+2}, \ldots, l_p$ may be equated to the latent roots of the sub-matrix consisting of the last $p-k$ rows and columns, which is

$$H = \lambda I + Q_2'(\delta A - \lambda\,\delta V)Q_2 + O(\delta^2).$$

The substitution of the determinant and trace of $H$ for the product and sum of the residual latent roots in (7) will involve an error only of order $\delta^3$, so to this order of approximation the expression (7) may be written as

$$n\left[-\log_e |H| + (p-k)\log_e \left\{\frac{1}{p-k}\operatorname{tr}(H)\right\}\right].$$
dios ms 
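This is equivalent to

\[ \frac{n}{2\lambda^2}\left[\operatorname{tr}\{Q_2'(\delta A - \lambda\,\delta V)Q_2\}^2 - \frac{1}{p-k}\{\operatorname{tr} Q_2'(\delta A - \lambda\,\delta V)Q_2\}^2\right] + O(\delta^3) \]

\[ = \frac{n}{2\lambda^2}\left[\operatorname{tr}\{C(\delta A - \lambda\,\delta V)\}^2 - \frac{1}{p-k}\{\operatorname{tr} C(\delta A - \lambda\,\delta V)\}^2\right] + O(\delta^3), \tag{8} \]

where $C = Q_2Q_2' = I - Q_1Q_1'$.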


In order to evaluate the expectation of (8) we make use of the well-known result

\[ nE(\delta a_{pq}\,\delta a_{rs}) = \alpha_{pr}\alpha_{qs} + \alpha_{ps}\alpha_{qr}, \]

where, as before, $\alpha_{ij}$ denotes the true covariance (or correlation) between $x_i$ and $x_j$. We then find that
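\[ nE\{\operatorname{tr}(C\,\delta A)^2\} = (\operatorname{tr} CA_0)^2 + \operatorname{tr}(CA_0)^2 = \lambda^2(\operatorname{tr} C)^2 + \lambda^2\operatorname{tr} C = \lambda^2(p-k)(p-k+1), \]

\[ nE\{(\operatorname{tr} C\,\delta A)^2\} = 2\operatorname{tr}(CA_0)^2 = 2\lambda^2(p-k), \]

\[ nE\{\operatorname{tr}(C\,\delta A\,C\,\delta V)\} = 2\lambda^2\sum_{i=1}^{p} c_{ii}^2, \]

\[ nE\{(\operatorname{tr} C\,\delta A)(\operatorname{tr} C\,\delta V)\} = 2\lambda^2\sum_{i=1}^{p} c_{ii}^2, \]

\[ nE\{\operatorname{tr}(C\,\delta V)^2\} = 2\sum_{i=1}^{p}\sum_{j=1}^{p} c_{ij}^2\alpha_{ij}^2, \]

\[ nE\{(\operatorname{tr} C\,\delta V)^2\} = 2\sum_{i=1}^{p}\sum_{j=1}^{p} c_{ii}c_{jj}\alpha_{ij}^2, \]

where $c_{ij}$ is the typical element of $C$.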


Using these results and neglecting quantities of order $1/n$, we find the expectation of (8) to be

\[ \tfrac12(p-k-1)(p-k+2) - 2\lambda\left(1 - \frac{1}{p-k}\right)\sum_i c_{ii}^2 + \sum_i\sum_j (c_{ij}^2\alpha_{ij}^2) - \frac{1}{p-k}\sum_i\sum_j (c_{ii}c_{jj}\alpha_{ij}^2). \tag{9} \]

This expression represents the effective number of degrees of freedom for the approximate $\chi^2$. In practice the quantities $\lambda$, $c_{ij}$ and the correlation coefficients $\alpha_{ij}$ would usually have to be replaced by sample estimates.
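The effective number of degrees of freedom can be evaluated directly once $\lambda$, $C$ and the correlations are supplied. The sketch below assumes expression (9) has the form $\tfrac12(p-k-1)(p-k+2) - 2\lambda\{1 - 1/(p-k)\}\sum c_{ii}^2 + \sum\sum c_{ij}^2\alpha_{ij}^2 - (p-k)^{-1}\sum\sum c_{ii}c_{jj}\alpha_{ij}^2$; the function names and the nested-list representation of matrices are illustrative choices, not the author's.

```python
def effective_degrees_of_freedom(lam, C, alpha, k):
    """Evaluate expression (9) in the form assumed above.

    lam   : common value of the residual latent roots
    C     : p x p matrix Q2 Q2' given as nested lists
    alpha : p x p matrix of true correlations (nested lists)
    k     : number of leading roots set aside
    """
    p = len(C)
    q = p - k
    if q < 1:
        raise ValueError("k must be smaller than p")
    idx = range(p)
    sum_cii_sq = sum(C[i][i] ** 2 for i in idx)
    sum_cij_aij = sum(C[i][j] ** 2 * alpha[i][j] ** 2 for i in idx for j in idx)
    sum_cii_cjj = sum(C[i][i] * C[j][j] * alpha[i][j] ** 2 for i in idx for j in idx)
    return (0.5 * (q - 1) * (q + 2)
            - 2.0 * lam * (1.0 - 1.0 / q) * sum_cii_sq
            + sum_cij_aij
            - sum_cii_cjj / q)


def equicorrelated_case(p, rho):
    """Special case: all correlations equal to rho, k = 1."""
    alpha = [[1.0 if i == j else rho for j in range(p)] for i in range(p)]
    # C = I - (1/p) * (matrix of ones), since Q1 is proportional to (1, ..., 1)'
    C = [[(1.0 if i == j else 0.0) - 1.0 / p for j in range(p)] for i in range(p)]
    return effective_degrees_of_freedom(1.0 - rho, C, alpha, k=1)
```

For the equicorrelated case the function reproduces the closed form $\tfrac12(p-2)(p+1) - (p-1)(p-2)(1-\rho)^2/p$, which is a convenient check on the assumed form of (9).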

In the limiting case where all correlations tend to unity and $\lambda \to 0$, the number of degrees of freedom becomes $\tfrac12(p-k-1)(p-k+2)$. Furthermore, expression (8) is then asymptotically equivalent to

\[ \frac{n}{2\lambda^2}\left[\operatorname{tr}(Q_2'\,\delta A\,Q_2)^2 - \frac{1}{p-k}(\operatorname{tr} Q_2'\,\delta A\,Q_2)^2\right]. \]

[...] This is so because the $\tfrac12(p-k)(p-k+1)$ distinct elements [...] are asymptotically normally and independently distributed, the [...] elements having variance $\lambda^2/n$.
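In the case considered by Bartlett (1951) all true correlations are equal to $\rho$, and the test is for equality of the remaining roots after the effect of the first has been removed. The first true latent root is $\lambda_1 = 1 + (p-1)\rho$, the residual roots are all $1 - \rho$, and the latent column vector $Q_1$ corresponding to $\lambda_1$ is $p^{-1/2}(1, 1, \ldots, 1)'$. Hence putting $k = 1$, $\lambda = 1 - \rho$, $\alpha_{ii} = 1$, $\alpha_{ij} = \rho$, $c_{ii} = (p-1)/p$, $c_{ij} = -1/p$ $(i \neq j)$ we have

\[ \sum_i c_{ii}^2 = \frac{(p-1)^2}{p}, \qquad \sum_i\sum_j (c_{ij}^2\alpha_{ij}^2) = \frac{(p-1)(p-1+\rho^2)}{p}, \qquad \sum_i\sum_j (c_{ii}c_{jj}\alpha_{ij}^2) = \frac{(p-1)^2\{1+(p-1)\rho^2\}}{p}, \]

and the expression (9) becomes

\[ \tfrac12(p-2)(p+1) - \frac{(p-1)(p-2)}{p}(1-\rho)^2. \]

This agrees with the value found by Bartlett.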


I am indebted to Prof. T. W. Anderson for showing me the draft of a paper by him on 'Asymptotic theory for principal component analysis', which is to be published at a later date. In it he suggests that a better approximation to the distribution of the criterion considered in this section would be obtained by using $c\chi_d^2$, where $c$ and $d$ are constants and where $\chi_d^2$ denotes a $\chi^2$ variate with $d$ degrees of freedom. In order to fit the constants $c$ and $d$ it would, however, be necessary to find the variance as well as the expectation of (7); and this would clearly be an unpleasantly complicated expression.


Note added in proof. The conjecture that use of the correct multiplying factors for the criteria of §§2 and 3 would make all moments agree with those of $\chi^2$, neglecting quantities of order $1/n^2$, is now known to be correct. This follows from a general result, to be published later, concerning likelihood ratio criteria.


REFERENCES

BARTLETT, M. S. (1951). The effect of standardization on a $\chi^2$ approximation in factor analysis. Biometrika, 38, 337.
BARTLETT, M. S. (1954). A note on the multiplying factors for various $\chi^2$ approximations. J. R. Statist. Soc. B, 16, 296.


THE VARIANCE OF THE MEAN OF SYSTEMATIC SAMPLES 


By R. M. WILLIAMS 


Applied Mathematics Laboratory, D.S.I.R., New Zealand


1. INTRODUCTION 
Systematic sampling methods have been used in a variety of fields, notably in forestry and ecological studies, because of their convenience in the field and their advantages over a randomized scheme if the survey is also to be used for mapping. Their use was justified by work such as that of Hasel (1938) and Osborne (1942), who examined the sampling errors of random, stratified random and systematic surveys by subjecting detailed forestry survey data to analysis by these three methods. These analyses showed that the systematic designs were generally the most efficient. They left unsolved the problem of estimating the error of the systematic sample, although Osborne's paper gave a method based on computing the mean serial correlation which gave reasonable agreement with experimental results.


More recently Finney (1948), also using data from forest surveys, has compared the variances of systematic samples [...] with those of random samples and stratified random samples [...]. Finney showed that the variance of the systematic sample differed little from the variance of a stratified random sample of the same size with one observation per stratum, but was smaller than the variance obtained with half the number of strata and two observations per stratum, which appeared to be the most [...] system giving an unbiased estimate of the variance without supplementary information. Further investigation of the variance of systematic sampling with various models seems likely to be useful.


Yates (1948) examined the variance for a number of particular cases and gave a formula (p. 362) which, with supplementary information, provided an estimate of the sampling variance of the means of a systematic sample, or alternatively, subject to certain assumptions, provided an upper limit to [...]. Jowett (1952) gave a method for the determination of the variance of a systematic sample, on the assumption that the observations were [...] derived from earlier work by Cochran (1946) [...].

[...] a method similar to Jowett's, but involving rather weaker assumptions [...] for the variance of samples with spacings less than [...] (which is not covered by Jowett's method); this includes the case where [...]. [...] the often-stated [...] that a systematic sample cannot by itself provide an estimate of error, since the assumptions which are made about the population are equivalent to supplementary information. Many alternative assumptions could have been made, and the justification of the particular [...] in the cases considered below yield results which agree with experiment.


These investigations arose out of discussions with officers of the North Canterbury Catchment Board (New Zealand) on the spacing of points along a line on which observations were to be made to determine the percentage of ground with various types of cover. The main part of the theory is applicable to any series of observations spaced out uniformly in time or space which fit the model described below and holds for both discrete and continuous variables.

We shall, for convenience, always refer to the transect, meaning the whole line on which observations are made, and to the points on it. The details of the experimental work are given in §4.

2. FORMULAE FOR VARIANCES 


We regard the transect as composed of $kn$ points; the sample is drawn by taking every $k$th [...] $x_i$ ($i = 1, 2, \ldots, kn$). The variance $\sigma_s^2(n, k)$ of the mean of a systematic sample (defined as the mean square deviation of the sample mean from the population mean $\bar{x}$) drawn from a finite population is given by Madow & Madow (1944) in the first of a series of three papers (Madow, 1949, 1953), and may be derived in a convenient form from the usual partition of sums of squares and the relationship

\[ \sum_{i=1}^{kn}(x_i - \bar{x})^2 = \sum_{i=1}^{kn}\sum_{j>i}(x_i - x_j)^2/kn, \]

which gives

\[ \sigma_s^2(n, k) = \phi(0)\left(\frac{1}{n} - \frac{1}{kn}\right) + 2Z_n - 2S_{kn}, \tag{1} \]

where

\[ \phi(0) = \sum_{i=1}^{kn}(x_i - a)^2/kn, \qquad Z_n = \frac{1}{n}\sum_{u=1}^{n-1}\frac{1}{kn}\sum_{i=1}^{k(n-u)}(x_i - a)(x_{i+ku} - a), \qquad S_{kn} = \frac{1}{kn}\sum_{v=1}^{kn-1}\frac{1}{kn}\sum_{i=1}^{kn-v}(x_i - a)(x_{i+v} - a). \tag{2} \]

Here $a$ is any constant.
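The finite-population result can be checked by brute force. The sketch below assumes the variance takes the form $\phi(0)(1/n - 1/kn) + 2Z_n - 2S_{kn}$, with $Z_n$ and $S_{kn}$ the lagged product sums over multiples of $k$ and over all lags respectively; it compares this with the variance obtained by enumerating all $k$ possible systematic samples. The function names are illustrative only.

```python
def systematic_variance_brute_force(x, k):
    """Mean square deviation of the k systematic sample means from the
    population mean, found by enumerating every possible starting point."""
    total = len(x)
    n = total // k
    pop_mean = sum(x) / total
    means = [sum(x[r + j * k] for j in range(n)) / n for r in range(k)]
    return sum((m - pop_mean) ** 2 for m in means) / k


def systematic_variance_formula(x, k, a=0.0):
    """The same variance from phi(0), Z_n and S_kn (form assumed above).
    The constant a is arbitrary and should not affect the result."""
    total = len(x)                      # kn points; must be a multiple of k
    n = total // k
    d = [xi - a for xi in x]
    phi0 = sum(di * di for di in d) / total
    z_n = sum(d[i] * d[i + k * u]
              for u in range(1, n)
              for i in range(k * (n - u))) / (total * n)
    s_kn = sum(d[i] * d[i + v]
               for v in range(1, total)
               for i in range(total - v)) / (total * total)
    return phi0 * (1.0 / n - 1.0 / total) + 2.0 * z_n - 2.0 * s_kn
```

Changing `a` leaves the computed variance unchanged, in line with the remark that $a$ is any constant.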


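We let $k$ tend to infinity and define $s$ as $i/kn$, $x(s)$ as the corresponding value of $x_i$ and

\[ \phi(t) = \int_0^{1-t}\{x(s) - a\}\{x(s+t) - a\}\,ds. \tag{3} \]

In the limit

\[ Z_n \to \frac{1}{n}\sum_{u=1}^{n-1}\phi\left(\frac{u}{n}\right), \qquad S_{kn} \to \int_0^1 \phi(t)\,dt = \Phi, \]

and thus

\[ \sigma_s^2(n, k) \to \frac{\phi(0)}{n} + 2Z_n - 2\Phi = \sigma_s^2(n). \tag{4}* \]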
* Jowett (1952) gave this formula in a slightly different form. Quenouille (1949) gave the form when $n$ is large as well as $k$.

An alternative form for $\sigma_s^2(n)$ is obtained from the Euler-Maclaurin theorem, from which it follows that if $\phi(t)$ has continuous derivatives up to the $(2q+1)$th order

\[ \frac{1}{n}\sum_{u=1}^{n-1}\phi\left(\frac{u}{n}\right) = \int_0^1 \phi(t)\,dt - \frac{1}{2n}\{\phi(0) + \phi(1)\} + \sum_{j=1}^{q}\frac{B_{2j}}{(2j)!\,n^{2j}}\{\phi^{(2j-1)}(1) - \phi^{(2j-1)}(0)\} + \frac{(-1)^{q+1}}{n^{2q+1}}\int_0^1 P_{2q+1}(nt)\,\phi^{(2q+1)}(t)\,dt, \]

where $\phi^{(i)}$ denotes the $i$th derivative, $B_{2j}$ are the Bernoulli numbers and

\[ P_{2q+1}(x) = 2\sum_{l=1}^{\infty}\frac{\sin(2l\pi x)}{(2l\pi)^{2q+1}}. \]

Fig. 1. Graph of $\phi(t)$ showing shaded areas proportional to $\sigma_s^2(n)$. (Not to scale.)

Assuming that the series converges we have

\[ \sigma_s^2(n) = \frac{\phi^{(1)}(1) - \phi^{(1)}(0)}{6n^2} - \frac{\phi^{(3)}(1) - \phi^{(3)}(0)}{360n^4} + \frac{\phi^{(5)}(1) - \phi^{(5)}(0)}{15120n^6} - \cdots. \tag{5} \]

The assumption that $\phi(t)$ is differentiable to the required order is one that in general will not be fulfilled by the points in a particular transect. If, however, we follow Cochran (1946) and regard the observation of a particular transect as a sample drawn from a multivariate parent population, then it will often be reasonable to assume that the expected values of $\phi(t)$, etc. (the expectation being taken over all possible drawings from the parent population) do have the required property of differentiability even if the individual values are discontinuous. We shall therefore, in future, consider the expected value of the variance, etc. Since the formulae are unchanged if we replace $\sigma_s^2(n)$, $\phi(t)$, etc. by their expected values, and there is no danger of confusion, we shall use the same symbols for the expected values. Where we wish to emphasize the distinction we shall refer to the two values as the sample values and the expected values of $\sigma_s^2(n)$, $\phi(t)$, etc. It will be seen that the distinction, although essential where equation (5) is used, is of less practical importance in the other case.


Fig. 2. Relative values of variances for monotonic decreasing $\phi(t)$, plotted against the spacing $1/n$. (Not to scale.)


By similar methods we find that the sampling variance of the mean of a random sample of $n$ members ($\sigma_r^2(n)$) is given by

\[ \sigma_r^2(n) = \frac{1}{n}\{\phi(0) - 2\Phi\}. \tag{6} \]

The functions involved in (4) and (6) are shown in Fig. 1 for a monotonic decreasing $\phi(t)$. $\Phi$ is given by the area under the curve $\phi(t)$; $\phi(0)/2n + Z_n$ is the area of the polygon $OABCD\ldots$, so that $\sigma_s^2(n)$ is equal to twice the area shown shaded.*

* This graphical method is equivalent to that given by Jowett (1952). It should however be noted that Jowett's assumption that the $x$'s form a stationary time series is not invoked.
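A small numerical check of the two variance formulae may help. The sketch below takes $\sigma_s^2(n) = \phi(0)/n + 2Z_n - 2\Phi$ and $\sigma_r^2(n) = \{\phi(0) - 2\Phi\}/n$ and applies them to a hypothetical transect with a pure linear trend, $x(s) = s$ and $a = 0$, for which $\phi(t) = (1-t)^3/3 + t(1-t)^2/2$ and $\Phi = 1/8$. This example transect is an assumption introduced here for illustration and is not taken from the paper.

```python
def phi_linear(t):
    """phi(t) for the hypothetical transect x(s) = s with a = 0:
    the integral of s * (s + t) over 0 <= s <= 1 - t."""
    return (1.0 - t) ** 3 / 3.0 + t * (1.0 - t) ** 2 / 2.0


PHI_AREA = 1.0 / 8.0   # integral of phi_linear over (0, 1)


def systematic_variance(n):
    """sigma_s^2(n) = phi(0)/n + 2 Z_n - 2 Phi."""
    z_n = sum(phi_linear(u / n) for u in range(1, n)) / n
    return phi_linear(0.0) / n + 2.0 * z_n - 2.0 * PHI_AREA


def random_variance(n):
    """sigma_r^2(n) = (phi(0) - 2 Phi) / n."""
    return (phi_linear(0.0) - 2.0 * PHI_AREA) / n
```

For this trend the systematic variance works out to $1/12n^2$ and the random variance to $1/12n$, so the first falls off as the square of the spacing and the second only linearly.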


From (5) it follows that, provided the expansion is valid, $\sigma_s^2(n)$ initially varies as the square of the spacing ($1/n$) between observations; on the other hand, $\sigma_r^2(n)$ varies linearly with $1/n$. For sufficiently large $n$, $\sigma_r^2(n)$ will therefore be greater than $\sigma_s^2(n)$, and the systematic sample more efficient than the random.

If $\phi(t)$ is such that it changes linearly beyond some point $t_0$, then as the spacing of the sampling points increases beyond $t_0$, $\sigma_s^2(n)$ will increase linearly with $1/n$ (see Fig. 2).

The variance $\sigma_s^2(n)$ is a measure of the variability of the mean of all samples drawn from the same transect; if we denote by $\sigma_p^2(n)$ the mean square of the deviations of the mean of a sample about $\int_0^1 \mu(t)\,dt$, where $\mu(t)$ is the mean at the point $t$ of the parent population from which the points of the transect are drawn, then

\[ \sigma_p^2(n) = \sigma_s^2(n) + 2\Phi', \tag{7} \]

where $\Phi'$ is the value of $\Phi$ when $a$ is put equal to $\int_0^1 \mu(t)\,dt$ (see Fig. 2). This variance is of interest if we wish to determine the significance of suspected changes in [...], and we are [...] assume that the covariance matrix of the $x$'s remains the same. This would [...]; for example, in the transects considered in this paper, the gradual replacement of one species of plant by another might lead not only to changes in the proportions of covered ground but also in the covariance matrix.

3. ESTIMATION 
3·1. To determine $\sigma_s^2(n)$ we must, ideally, know $\phi(t)$ at all points in the range (0, 1). We estimate this from a sample $x_r$ ($r = 1, 2, \ldots, m$) of $m$ points spaced uniformly $1/m$ apart with the first point randomly located in (0, $1/m$). For such a sample we define

\[ \psi(u/m) = \frac{1}{m}\sum_{r=1}^{m-u}(x_r - a)(x_{r+u} - a); \tag{8} \]

$\psi(u/m)$ is an unbiased estimate of $\phi(u/m)$.

If we can assume a specific form (e.g. exponential) for $\phi(t)$ we can fit this to the observed values of $\psi(u/m)$, preferably for values of $u$ sufficiently low to ensure that they are based on a reasonable number of points, and determine $\sigma_s^2(n)$ either graphically or from the series form. Where this is not possible the determination of $\phi(t)$ presents no difficulties for low values of $t$, and consequently the differential coefficients at $t = 0$, required for the series form, can easily be determined; but as $t$ approaches unity the number of terms involved will be too small to give reliable estimates. For the graphical form the same objection holds, and in addition it is desirable to avoid the [...]. If $\phi(t)$ is linear above some value $t_0$ then in the graphical method no further contribution to $\sigma_s^2(n)$ will be made by continuing $\phi(t)$ beyond $t_0$. In practice it will probably be sufficient to continue $\phi(t)$ to a point such that it is effectively linear over lengths equal to the greatest sampling intervals likely to be of interest. Assuming the expansion is valid, the error involved in considering only the contribution [...] to $t_0$ is given by

\[ 2\sum_j \frac{B_{2j}}{(2j)!\,n^{2j}}\{\phi^{(2j-1)}(1) - \phi^{(2j-1)}(t_0)\}. \]
2 (27) v? 


If the series form is used some equivalent assumption is necessary. If it is assumed that $\phi(t)$ is linear from $t = t_0$ onwards, then $\phi^{(1)}(1) = -\phi(t_0)/(1 - t_0)$ and all higher derivatives are zero. An alternative to this is to assume that in the region (0, $1 - t_0$), $x(s)$ has a mean $\mu_1$ and is independent of $x(s)$ in the region ($t_0$, 1), where the mean is $\mu_2$; then for $t_0 \le t \le 1$

\[ \phi(t) = (1 - t)(\mu_1 - a)(\mu_2 - a). \]

Visual inspection of the observations should provide some check on the validity of this assumption, and $\phi(t)$ can then be determined directly from the means at the beginning and end of the transect. Even if this assumption does not hold exactly it does not appear to be very sensitive to deviations. For example, if $\mu_1$ and $\mu_2$ are not constant but each is subject to a different linear trend of the form

\[ \mu_1 = \alpha_1 + \beta_1 t, \qquad \mu_2 = \alpha_2 + \beta_2 t, \]

then

\[ \phi(t) = (1 - t)(\alpha_1 - a)(\alpha_2 + \beta_2 t - a) + \tfrac12(1 - t)^2\{\beta_1(\alpha_2 + \beta_2 t - a) + \beta_2(\alpha_1 - a)\} + \tfrac13(1 - t)^3\beta_1\beta_2, \]

which gives

\[ \phi^{(1)}(1) = -(\alpha_1 - a)(\alpha_2 + \beta_2 - a), \qquad \phi^{(3)}(1) = \beta_1\beta_2. \]

The assumption of constant means in the regions (0, $1 - t_0$) and ($t_0$, 1) gives

\[ \phi^{(1)}(1) = -(\alpha_1 - a)(\alpha_2 + \beta_2 - a) - \tfrac12(1 - t_0)\{\beta_1(\alpha_2 + \beta_2 - a) - \beta_2(\alpha_1 - a)\} + \tfrac14(1 - t_0)^2\beta_1\beta_2, \]

so that, if $t_0$ is taken as large as possible consistent with basing the means on a reasonable sample, the error will not be large unless the trend is marked, i.e. $\beta_1$ and $\beta_2$ are large.

3·2. We may want to calculate $\sigma_s^2(n)$ for values of $n$ greater [...] mean of a sample is to be calculated from the sample itself [...]. When $n$ is substantially smaller than $m$, say half or less, the [...] the series form will converge slowly. When $n$ is either of the same order as or greater than $m$, the sampling fluctuations in determining the values of $\psi(t)$ will be comparable with the differences to be measured, and the series form must be used.

The differential coefficients required at $t = 0$ may be determined by the Gregory-Newton formulae, i.e. by putting

\[ \phi^{(1)}(0) = m\{\Delta\psi(0) - \tfrac12\Delta^2\psi(0) + \cdots\}, \]

\[ \phi^{(3)}(0) = m^3\{\Delta^3\psi(0) - \tfrac32\Delta^4\psi(0) + \cdots\}, \]

where

\[ \Delta\psi(0) = \psi\left(\frac{1}{m}\right) - \psi(0), \qquad \Delta^2\psi(0) = \psi\left(\frac{2}{m}\right) - 2\psi\left(\frac{1}{m}\right) + \psi(0). \]
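The forward-difference estimates can be turned into the two leading coefficients of the series for the variance of a percentage, written as $(10^4/n^2)(c_1 - c_2 m^2/n^2)$. The sketch below is one reading of that calculation under stated assumptions: the observations are 0/1 with $a = 0$, so that $\psi(u/m) = \nu_u/m$ with $\nu_u$ the count of like pairs $u$ places apart; differences are truncated after the third; $\phi^{(3)}(1)$ is taken as zero; and the value of $\phi^{(1)}(1)$ is supplied by the caller. The function names are hypothetical.

```python
def forward_differences(values, order):
    """Leading forward differences of a sequence at its first point:
    returns [delta, delta^2, ..., delta^order]."""
    diffs = list(values)
    leading = []
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        leading.append(diffs[0])
    return leading


def series_coefficients(nu, phi1_at_1):
    """Coefficients (c1, c2) of (10^4 / n^2) * (c1 - c2 * m^2 / n^2).

    nu        : counts nu_0, nu_1, nu_2, nu_3 of like pairs at lags 0..3
    phi1_at_1 : assumed value of the first derivative of phi at t = 1
    """
    d1, d2, d3 = forward_differences(nu[:4], 3)
    # phi'(0) ~ m * (D psi - D^2 psi / 2 + D^3 psi / 3); with psi = nu / m
    # the factor m cancels.
    phi1_at_0 = d1 - d2 / 2.0 + d3 / 3.0
    c1 = (phi1_at_1 - phi1_at_0) / 6.0
    # phi'''(0) ~ m^3 * D^3 psi = m^2 * D^3 nu, giving the m^2 / n^2 term.
    c2 = -d3 / 360.0
    return c1, c2
```

With counts 1570, 1362, 1252, 1169 at lags 0 to 3 and `phi1_at_1 = -(0.3928 ** 2)` (an assumed value equal to minus the squared proportion of cover), the coefficients come out close to 46.75 and 0.197.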


Since the sampling variance of $\Delta^i\psi(0)$ increases rapidly with $i$ it will be better to use only a few terms in estimating $\phi^{(1)}(0)$, $\phi^{(3)}(0)$, etc., and for the same reason to use only a few terms in the actual expansion of $\sigma_s^2(n)$. This means that the series expansion is only useful when $n$ is large enough to ensure rapid convergence, since otherwise serious bias may occur. In many practical cases precise estimates of the variance are not needed, and fairly large errors may be tolerated.


3·3. When the observations are subject to a periodic effect it may cause serious errors if the period is close to $1/m$ or a submultiple of it, since in the graphical method it will not be justifiable to draw a smooth curve through the calculated values of $\psi(u/m)$; the series form may not converge sufficiently rapidly to provide reliable results. If the existence of a period can be expected from the nature of the data, then this difficulty can be avoided by making $1/m$ a sufficiently small fraction of the period. Finney (1950) gave an example of an unexpected periodic variation occurring in a forestry survey, for which there appeared to be no simple explanation; the only safeguard against effects of this type is to take the observations so close that it is at least unlikely that the periodic effect will be undetected.
3·4. In the particular case where $x(s)$ has the value 0 or 1, equation (8) has a very simple form if we put $a = 0$. Writing $\nu_u$ for the number of 1's followed $u$ places later by another 1 we have

\[ \psi(u/m) = \nu_u/m. \]

In this case we can set limits on the gradient at $t = 1$ since

\[ 0 \le \phi(t) \le (1 - t), \]

so that

\[ -1 \le \phi^{(1)}(1) \le 0. \]

If the points at the extreme ends of the transect are independent, then $\phi^{(1)}(1) = -p_1p_2$, where $p_1$ and $p_2$ are the proportions of 1's at the beginning and end of the transect.

If $m$ is large so that we can neglect all but the first differences, we have

\[ \sigma_s^2(n) \approx \frac{m_r}{12n^2}, \tag{9} \]

where $m_r$ is the number of runs of 1's and 0's in the set of $m$ observations.

This approximation is only valid if $m$ is so large that, because of the high correlation between adjacent points, the number of runs is altered relatively little by small changes in $m$. We can, for example, find the number of runs in the series obtained by omitting alternate observations; if this differs little from $m_r$ we can use this form as a quick estimate of the variance.
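The quick estimate from the number of runs is easy to compute. The sketch below assumes the approximation $\sigma_s^2(n) \approx m_r/12n^2$ and adds a helper that counts runs in a 0/1 sequence; both function names are illustrative. In the usage note, the run count is approximated as twice the drop in like-pair counts from lag 0 to lag 1, which is an assumption made here and may differ from an exact count by end effects.

```python
def count_runs(bits):
    """Number of runs of identical values in a sequence of 0's and 1's."""
    if not bits:
        return 0
    return 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def quick_variance(m_r, n):
    """Approximate variance of the systematic sample mean, m_r / (12 n^2)."""
    return m_r / (12.0 * n * n)
```

For example, taking $m = 3997$ points with counts $\nu_0 = 1570$ and $\nu_1 = 1362$ gives roughly $2(1570 - 1362) = 416$ runs; at a spacing of five points ($n = m/5$) the variance of the percentage, `1e4 * quick_variance(416, 3997 / 5)`, is about 0.54.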


of 1’s and 


4. EXPERIMENTAL RESULTS 
The experiment consisted of a record of [...] cover which was classified as [...]. The vegetation comprised a portion of the [...] tussock grasslands [...] of the South Island [...] to some 7000 ft. above sea-level. [...] a century sheep have [...] tussock grasslands which have been divided by natural barriers and fences into extensive blocks.

The point method of analysis was used and line transects of 10 chains and more in length were permanently pegged. The point analyser consisted of a metal frame 20 in. [...] which carried [...] steel pins 2 in. apart. At the same calendar date in the summer of each year the analyser was placed at a series of adjacent positions along the transect and


144 The variance of the mean of systematic samples 


Table 1. The number $\nu_u$ of points followed by a point with similar cover at spacing $u$


| Tran- 
| Transect 4 Transect 7 Transect 11 sect 8 
" mu " NEN V 
L D| B | wjp g L Dİ z L 
| 
| | = 
| | rq 
0 | 1570 | 201 | 2226 | 4713 | 343 | 1227 2955 |127 | 676 3335 
1 1362 58 | 1992 | 4315 | 84 | 933 | 2787 |33 | 514 2750 
2 1252 38 1895 | 4194 55 | 848 | 2718 | 24 | 456 2498 
3 1169 29 1815 | 4116 | 43 | 802 | 2668 | 14 | 424 2340 
4 1100 25 1745 | 4061 | 36 | 750 | 2635 | 16 401 2214 
5 | 1046 | 27 | 1681 | 4015 | s1 | mv | seul | a2 | agy | 2152 
6 | 974 | 20 | 1623 | 3991 33 | 694 | 2574 | 11 365 2077 
7 919 16 | 1579 3974 | 27 | 668 | 2556 8 350 3042 
8 872 19 | 1530 | 3936 | 21 647 | 2534 | 6 | 338 | 2025 
829 13 1488 | 3924 2 , 99 
9 | | * | 630 | 2518 | g | 320 | 199 
10 197 1l | 1454 | 3909 | 18 | 626 | 9506 | 4 | sja 1974 
11 713 9 1426 | 3888 | 18 | 615 | 2499 | 3 306 1962 
12 762 | 11 | 1411 | 3869 | 18 | 595 | 2478 | 4 | 594 | 1968 
13 743 | 15 | 13908 | 3855 | 17 | 586 | 5404 | v | gg] | 1974 
14 189 | JL j 3982 | 9859. | ex | set, | eai | a| om 1981 
| 
15 724 8 1369 | 3847 l7 | 581 | 9444 1979 
6 8 
16 724 9 | 1369 | 3832 | 18 | 561 | 2434 | g Fa 1966 
17 730 4 1373 | 3812 18 | 550 | 2427 7 259 1936 
18 728 12 1368 | 3809 17 | 557 93418 | 6 | og] 1927 
19 719 16 1363 | 3788 14 | 544 | 9414 | 3 | oso 1915 
20 718 11 1363 3784 13 530 2408 4 245 1936 
21 709 14 1348 3786 21 535 2404 T 243 1951 
22 704 13 1330 | 3788 | 24 | 522 | 2407 | 5 | 55s 1941 
23 693 11 1330 | 3779 17 524 2404 5 249 1945 
24 694 | 11 1325 | 3778 | 21 | 522 | 9389 | g | 336 | 1942 
25 696 10 | 1322 3767 25 513 2380 3 232 1944 
26 700 10 1326 3755 32 | 501 2387 | 5 236 1958 
27 703 14 | 1335 | 3745 1$ | 92 | 38$ | 4 | $a 1945 
28 = X — 3745 19 488 2398 4 240 1931 
29 = = ex 3746 14 483 2396 4 244 1918 
30 = -| — 3745 19 | 473 | 2808 | g 237 1910 
35 — E = = = s = EX ds 1875 
40 — — = Sm = a = = T 1826 
No. of 6283 
321 
pointe 3997 3758 63 
S RR 
Percentage 39-28 | 503 55-69 75:01 5-46 | 19-53 78-63 3-38 | 17-99 52-76 
of cover 


the state of the cover beneath each pin was recorded; this gave a set of observations uniformly spaced every 2 in. along the transect.

Four transects covering a wide range of conditions were chosen for study; these are transects 4, 7, 8 and 11. A brief description of each is given in the Appendix.

We consider separately the three cases living or not living, dead or not dead, bare or not bare; the first being the most important ecologically. We describe the state of the ground at the point $s$ by putting $x(s) = 1$ if there is living cover, $x(s) = 0$ if not, and similarly for the other cases.
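Table 2. The series for $\sigma_s^2(n)$

Transect 4, $m = 3997$

\[ V_L = \frac{10^4}{n^2}\left(46.75 - 0.197\,\frac{m^2}{n^2}\right) \]

\[ V_D = \frac{10^4}{n^2}\left(40.30 - 0.311\,\frac{m^2}{n^2}\right) \]

\[ V_B = \frac{10^4}{n^2}\left(57.03 - 0.333\,\frac{m^2}{n^2}\right) \]

Transect 7, $m = 6283$

\[ V_L = \frac{10^4}{n^2}\left(102.32 - 0.650\,\frac{m^2}{n^2}\right) \]

\[ V_D = \frac{10^4}{n^2}\left(74.17 - 0.592\,\frac{m^2}{n^2}\right) \]

\[ V_B = \frac{10^4}{n^2}\left(75.85 - 0.472\,\frac{m^2}{n^2}\right) \]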

VE. 
10* (57.53 — 0-2: il 
y, i (2788 0-239 5 


In Table 1 we give the values of $\nu_u$ for living, dead and bare cover (L, D, B) for the cases considered.

Examination of Table 1 shows that for the observations of dead points for spacings greater than about 50 in. (i.e. $u > 25$) the value of $\nu_u$ (apart from random fluctuations) is approximately that which would be expected if there was no correlation (i.e. $(m - u)p^2$, where $p$ is the proportion of dead points). For the living and bare points there is still some correlation [...] it changes sufficiently slowly to regard it as linear for all except large sampling intervals; this is equivalent to ignoring any contribution to the variance made by values of $\phi(t)$ not tabulated.


To obtain the coefficients in the series expansion only differences up to the third were used at $t = 0$ in order to avoid excessive sampling errors. At $t = 1$, $\phi^{(1)}(1)$ was taken as $-p^2$; all other derivatives were taken as zero.

In Table 2 we give the variances of the percentages of living, dead and bare ($V_L$, $V_D$, $V_B$) in series form. The total number of points in the transect is $m$ so that $n$ sampling points correspond to a spacing of $2m/n$ inches.


Table 3. Comparison between the variances of systematic and random samples drawn from correlated data

Spacing (2m/n in.)   10 in.            20 in.        40 in.                  60 in.        80 in.

Transect 4
  V_L                0.604 (0.543*)    1.746         6.088 (5.852†)          [illegible]   16.223 (9.223†)
  V_L'               2.996             5.992         11.985                  17.977        23.970
  V_D                0.370 (0.373*)    0.902         2.091 (3.233†)          3.330         4.568 (2.538†)
  V_D'               0.571             1.143         2.286                   3.429         4.572
  V_B                0.586 (0.610*)    1.959         6.219 (2.059†)          [illegible]   16.017 (19.519†)
  V_B'               3.090             6.180         12.360                  18.540        24.720

Transect 7
  V_L                0.400 (0.420*)    1.141         2.942 (6.200†)          4.991         7.020 (9.120†)
  V_L'               1.496             2.992         5.983                   8.975         11.967
  V_D                0.267 (0.273*)    0.644         1.462 (1.863†)          2.289         3.100 (1.918†)
  V_D'               0.409             0.819         1.637                   2.456         3.274
  V_B                0.341 (0.310*)    [illegible]   2.331 (3.229†)          3.480         5.300 (7.005†)
  V_B'               1.257             2.514         5.029                   7.543         10.058

Transect 11
  V_L                0.435 (0.510*)    1.606         4.090 (4.494†)          7.819         11.549 (19.366†)
  V_L'               2.235             4.470         8.940                   13.410        [illegible]
  V_D                0.243 (0.277*)    0.633         1.503 (1.769†)          2.444         3.386 (4.796†)
  V_D'               0.434             0.869         1.738                   2.606         [illegible]
  V_B                0.437 (0.463*)    1.389         3.463 (3.426†)          6.400         9.337 (13.535†)
  V_B'               1.917             3.833         7.667                   11.500        15.334

Transect 8
  V_L                0.661 (0.610*)    1.791         [illegible] (3.176†)    8.474         11.410 (6.744†)
  V_L'               1.969             3.939         7.878                   11.817        [illegible]

* Approximate estimates determined from $\sigma_s^2(n) = m_r/12n^2$.
† Estimates obtained from sampling experiment.


The values of the variance obtained by direct integration are given in Table 3, and for comparison the variances $V_L'$, $V_D'$ and $V_B'$ which would be obtained with the same sampling number if the observations were made at random. At close spacings $V_L$ and $V_B$ are much less than $V_L'$ and $V_B'$; the difference is smaller between $V_D$ and $V_D'$, due to the fact that the dead vegetation comprised a comparatively small percentage of the whole and occurred more or less randomly interspersed with the living cover, while the living vegetation, particularly in transect 4, tended to occur in clumps interspersed with patches of rock or bare ground, giving high values of $\phi(t)$ at close spacings of the order of 10-20 in. The smaller differences in transect 8 are consistent with the rather more uniform cover.

The values of the variance calculated from the approximate formula given in (9) for a spacing of 10 in. ($n = m/5$) are also shown in Table 3; the agreement is satisfactory. This [...] illustrates a point noted in §3·2. The approximate formula is very nearly equivalent to using only $\Delta\psi(0)$ to estimate $\phi^{(1)}(0)$ and ignoring all later terms in the series for $\sigma_s^2(n)$; for a spacing of 10 in. this gives a better approximation to the variance as determined by [...] series form given in Table 2, which involves $\Delta^2\psi(0)$ and $\Delta^3\psi(0)$ as well.
on the adequacy of the model we 


y and indirectly 
andomly chosen from the first 20 points on the 


nd a similar set of 20 samples taking every 
f these samples provide an experimental 
= 40, and kn =m. But o2(n,k) can 
aX(kn), so that by adding the small term o3(kn), 
tal estimate of o2(n,k), we obtain an estimate 
orresponding variances determined by the 
mparisons are not independent, 
the variances obtained by the 


f the means 0 
i = 20 or k 


to the c 
Table 3. These co 
variances to 


5. SUMMARY 


The formula given by Madow & Madow (1944) for the variance of the mean of a systematic sample [...] was applied to a continuous population, and a method was developed for determining, subject to certain assumptions, the variance of the mean of a systematic sample from single samples. This was applied to the problem of estimating the [...] of vegetative cover in ecological studies.

I wish to express my thanks to [...] Whittle and other members of the Applied Mathematics Laboratory for [...]; to Mr R. D. Dick, Soil Conservation Research [...] the North Canterbury Catchment Board, who raised this problem with the Laboratory, made available his extensive records, gave much time in discussions and supplied the descriptive notes; also to the referees for bringing certain recent work to my notice.
REFERENCES

COCHRAN, W. G. (1946). Relative accuracy of systematic and stratified random samples for a certain class of populations. Ann. Math. Statist. 17, 164-77.
FINNEY, D. J. (1948). Random and systematic sampling in timber surveys. Forestry, 22, 64-99.
FINNEY, D. J. (1950). An example of periodic variation in forest sampling. Forestry, 23, 96-111.
HASEL, A. A. (1938). Sampling error in timber surveys. J. Agric. Res. 57, 713-36.
JOWETT, G. H. (1952). The accuracy of systematic sampling from conveyor belts. Appl. Statist. 1, 50-9.
MADOW, W. G. (1949). On the theory of systematic sampling. II. Ann. Math. Statist. 20, 333-54.
MADOW, W. G. (1953). On the theory of systematic sampling. III. Ann. Math. Statist. 24, 101-6.
MADOW, W. G. & MADOW, L. H. (1944). On the theory of systematic sampling. I. Ann. Math. Statist. 15, 1-24.
OSBORNE, J. G. (1942). Sampling errors of systematic and random surveys of cover-type areas. J. Amer. Statist. Ass. 37, 256-64.
QUENOUILLE, M. H. (1949). Problems in plane sampling. Ann. Math. Statist. 20, 355-75.
YATES, F. (1948). Systematic sampling. Phil. Trans. A, 241, 345-77.


" 5 iS. 
stematic and random surveys of cover-type area 


APPENDIX: DESCRIPTIVE NOTES 


Transect 4, Leith Hill

Transect 7, Bridge Hill

The top peg is at 3450 ft., and the line transect descends at an average angle of 33° for a distance of 10 chains. The perennial herb Celmisia spectabilis with the tussock Festuca novae zealandae and the prostrate much-branched shrub Gaultheria depressa, dominate the physiognomic features in this portion of the tussock grasslands.

Transect 8, Constitution Hill

The top peg is at 4000 ft., and the line transect descends at an average angle of 33° for a distance of 15 chains. Festuca novae zealandae and Poa colensoi are the two common tussock species, and some two dozen native and introduced species of grasses and herbs grow in the inter-tussock spaces. Shrub species, such as Pimelia prostrata, Cassinia fulvida, Discaria toumatou and Leptospermum scoparium occur occasionally.

Transect 11, Lower Lyndon

The top peg is at 3550 ft. and the line transect descends at an average angle of 29°. This portion of the tussock grasslands is characterized by shrubby growth comprising Cyathodes colensoi, Gaultheria species, Discaria toumatou interspersed with Celmisia spectabilis, Festuca novae zealandae and representatives of about the two dozen species of native and introduced grasses and herbs common in these grasslands.


GROUPING METHODS IN THE FITTING OF POLYNOMIALS TO 
UNEQUALLY SPACED OBSERVATIONS 


By P. G. GUEST 
University of Sydney, Australia 


INTRODUCTION 
When a polynomial is to be fitted to a series of observations $y_i$ at values $x_i$ of the independent variable and the number of observations $n$ is large, the observations at neighbouring values of $x_i$ are often grouped together to give a set of $N = n/r$ pairs of values $y_N$, $x_N$. These quantities $y_N$, $x_N$ are the sums of the $r$ observed values in the particular group. Since the labour required to fit a polynomial curve increases very greatly as the number of observations becomes larger, the time spent in fitting a curve to the $N$ grouped observations is only a small fraction of that which would be required if the original set of $n$ observations were used. It is for this reason that grouping is often employed before fitting a curve to [...].

It is clear, however, that in general the grouped curve will not be the same as the curve obtained from the original set of observations. The random sampling errors will affect the curves differently in the two cases, but that is a minor matter. What is more important is that the grouping will give rise to a loss of information (an increase in standard error and decrease in efficiency) and may also give rise to bias in the estimates. It is the purpose of the present paper to investigate the last two effects [...] by means of [...] the suitability of the grouping method as an approximation to the least-squares solution may be determined in any particular case.

In an earlier paper (Guest, 1954) an account has been given of the use of grouping methods in the fitting of polynomials to equally spaced observations. Unfortunately, no such general account can be given in cases where the observations are not spaced at equal intervals. The procedure adopted here is to describe the departure from uniform spacing by two parameters $\kappa_2$ and $\kappa_3$. $\kappa_2$ measures the [...] of the observations towards one end of the range of the independent variable $x$, while $\kappa_3$ is a measure of the crowding towards the centre of the range rather than towards the ends. An account of the use of these parameters has been given [...].

A short section has been included on the use of [...] in the fitting of polynomials, [...] a full discussion of the cases [...] first degree. Finally, [...] various methods of obtaining a fitted curve [...] the point of view of time required, of efficiency, and [...] bias of the estimates.


REPRESENTATION OF THE OBSERVATIONS

When the values of the independent variable $x$ at the $n$ points of observation are arranged in order of magnitude, each observation can be identified by the number $\epsilon$ giving its position in the sequence, $\epsilon$ taking integral or half-integral values from $+\tfrac12(n-1)$ to $-\tfrac12(n-1)$. The values of $x$ will be denoted by $x_n(\epsilon)$, and the observations at the points $x_n(\epsilon)$ will be represented by the symbols $y_n(\epsilon)$. In the present discussion the system of points $x_n(\epsilon)$ will be replaced by a smoothed-out system $X_n(\epsilon)$ obtained by fitting a curve of the third degree in $\epsilon$ to the values $x_n(\epsilon)$. With a suitable choice of origin


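$$X_n(\epsilon) = k_{1n}T_{1n}(\epsilon) + k_{2n}T_{2n}(\epsilon) + k_{3n}T_{3n}(\epsilon), \tag{1}$$

where

$$k_{jn} = \sum_\epsilon x_n(\epsilon)\,T_{jn}(\epsilon) \Big/ \sum_\epsilon T_{jn}^2(\epsilon), \tag{2}$$

and $T_{jn}(\epsilon)$ is the orthogonal polynomial of degree $j$ in $\epsilon$.

If the $n$ observations are converted by grouping into $N = n/r$ values

$$y_N(\epsilon) = \sum_{z=-\frac12(r-1)}^{\frac12(r-1)} y_n(r\epsilon + z)$$

at points

$$x_N(\epsilon) = \sum_{z} x_n(r\epsilon + z),$$

the smoothed-out system for the grouped observations may be written as

$$X_N(\epsilon) = k_{1N}T_{1N}(\epsilon) + k_{2N}T_{2N}(\epsilon) + k_{3N}T_{3N}(\epsilon), \tag{3}$$

where, from the equations connecting the polynomials $T_{jn}(\epsilon)$, $T_{jN}(\epsilon)$ (Guest, 1954),

$$k_{1N} = r^2\left[k_{1n} - \tfrac{1}{10}(r^2-1)k_{3n}\right], \qquad k_{2N} = r^3k_{2n}, \qquad k_{3N} = r^4k_{3n}. \tag{4}$$

It is convenient to remove a scale factor

$$\rho_n = k_{1n} + \tfrac{1}{10}n^2k_{3n} \tag{5}$$

from equation (1), and to change to polynomials

$$\tau_{jn}(\epsilon) = n^{1-j}T_{jn}(\epsilon). \tag{6}$$

Then

$$X_n(\epsilon) = \rho_n\,\xi_n(\epsilon), \tag{7}$$

where

$$\xi_n(\epsilon) = \kappa_{1n}\tau_{1n}(\epsilon) + \kappa_{2n}\tau_{2n}(\epsilon) + 2\kappa_{3n}\tau_{3n}(\epsilon), \tag{8}$$

$$\kappa_{1n} = \rho_n^{-1}k_{1n}, \qquad \kappa_{2n} = n\rho_n^{-1}k_{2n}, \qquad \kappa_{3n} = \tfrac12 n^2\rho_n^{-1}k_{3n}, \tag{9}$$

and

$$\kappa_{1n} = 1 - \tfrac15\kappa_{3n}. \tag{10}$$

The advantage of this notation is that the coefficients $\kappa_{jn}$ are all of the same order of magnitude, as also are the orthogonal polynomials $\tau_{jn}(\epsilon)$ (Guest, 1953a).

For the grouped observations $X_N(\epsilon)$ is replaced by

$$\xi_N(\epsilon) = \kappa_{1N}\tau_{1N}(\epsilon) + \kappa_{2N}\tau_{2N}(\epsilon) + 2\kappa_{3N}\tau_{3N}(\epsilon), \tag{11}$$

where

$$X_N(\epsilon) = r^2\rho_n\,\xi_N(\epsilon), \tag{12}$$

and

$$\kappa_{1N} = r^{-2}\rho_n^{-1}k_{1N}, \qquad \kappa_{2N} = Nr^{-2}\rho_n^{-1}k_{2N}, \qquad \kappa_{3N} = \tfrac12 N^2r^{-2}\rho_n^{-1}k_{3N}. \tag{13}$$

From equations (4), connecting $k_{jN}$ with $k_{jn}$, and equations (9) and (13), it is found that

$$\kappa_{2N} = \kappa_{2n}, \qquad \kappa_{3N} = \kappa_{3n}, \qquad \kappa_{1N} = \kappa_{1n} - \tfrac15(N^{-2} - n^{-2})\kappa_{3n}. \tag{14}$$

It will be assumed that $n$ is so large that $n^{-2}$ can be neglected. Then $\xi_N(\epsilon)$ can be written in the form

$$\xi_N(\epsilon) = \sum_{j=1}^{3}(c_j - N^{-2}d_j)\,\tau_{jN}(\epsilon), \tag{15}$$

where

$$c_1 = 1 - \tfrac15\kappa_3, \quad d_1 = \tfrac15\kappa_3; \qquad c_2 = \kappa_2, \quad d_2 = 0; \qquad c_3 = 2\kappa_3, \quad d_3 = 0, \tag{16}$$

and $\kappa_j$ is written for $\kappa_{jn}$.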
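The grouping relations (4) can be checked numerically. The sketch below is illustrative only: it assumes that the $T_{jn}(\epsilon)$ are the usual monic orthogonal polynomials on $n$ equally spaced index values, $T_1 = \epsilon$, $T_2 = \epsilon^2 - (n^2-1)/12$, $T_3 = \epsilon^3 - (3n^2-7)\epsilon/20$, and it uses the coefficients of (4) as read here from the text. The function names are hypothetical and do not come from the paper.

```python
from fractions import Fraction as F

def T(j, n, e):
    """Monic orthogonal polynomial of degree j on n equally spaced index values (assumed form)."""
    if j == 1:
        return e
    if j == 2:
        return e * e - F(n * n - 1, 12)
    if j == 3:
        return e ** 3 - F(3 * n * n - 7, 20) * e
    raise ValueError("only degrees 1 to 3 are needed here")

def smoothed(k, n, e):
    """Smoothed-out abscissa X(e) = k1*T1 + k2*T2 + k3*T3, as in equations (1) and (3)."""
    return sum(k[j] * T(j, n, e) for j in (1, 2, 3))

def grouped_coefficients(k, r):
    """Coefficients of the grouped system according to relations (4), as reconstructed."""
    return {1: r ** 2 * (k[1] - F(r * r - 1, 10) * k[3]),
            2: r ** 3 * k[2],
            3: r ** 4 * k[3]}

def index_values(n):
    """Index values -(n-1)/2, ..., +(n-1)/2 in unit steps."""
    return [F(2 * i - (n - 1), 2) for i in range(n)]

def check_grouping(k, n, r):
    """Compare group sums of X_n with X_N built from the grouped coefficients."""
    N = n // r
    kN = grouped_coefficients(k, r)
    for e in index_values(N):
        direct = sum(smoothed(k, n, r * e + z) for z in index_values(r))
        if direct != smoothed(kN, N, e):
            return False
    return True
```

With exact rational arithmetic the two sides agree identically, for odd and even $r$ alike, which is a useful check on the factor $\tfrac{1}{10}(r^2-1)$ in $k_{1N}$.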
i 
EFFICIENCIES OF THE ESTIMATES

The fitted curve of degree $p$ in $x$ is usually expressed in the form of a power series,

$$u_p(x) = \sum_{j=0}^{p} b_{jp}x^j. \tag{17}$$

Less commonly, the curve may be expressed as a series of orthogonal polynomials,

$$u_p(x) = \sum_{j=0}^{p} a_jT_j(x), \tag{18}$$

where $T_j(x)$ is the Tchebycheff polynomial of degree $j$ in the variable $x$ with the leading coefficient unity.* However, for theoretical discussions, the second form is much the more convenient because of the simplification brought about by the orthogonal property. The $b_{jp}$ are linear functions of the $a_j$; in particular, the coefficient of highest degree $b_{pp}$ is identical with $a_p$. It is the coefficients $a_j$ which are obtained in the forward section of the Doolittle method for the solution of the normal equations. The concentration on orthogonal coefficients must not be taken to imply that the actual fitting is to be done by evaluating the orthogonal polynomials at the points of observation; in fact, the usual methods based on power moments should be employed. The relation between the power series and the orthogonal representation has been fully discussed by Guest (1950) and by Hayes & Vickers (1951).

The variance of $a_j$ is given by the formula

$$\operatorname{var} a_j = \operatorname{var} y_i \Big/ \sum_i T_j^2(x_i). \tag{19}$$

The values $x_i$ are, in the representation used here, replaced by the smoothed set $X_n(\epsilon)$. Using equations (7), (8) and (10), and the standard expressions for $\tau_{jn}(\epsilon) = n^{1-j}T_{jn}(\epsilon)$, it can be shown that $\sum T_j^2(x_i)$ is given by an approximate equation of the form

$$\sum_i T_j^2(x_i) = \rho_n^{2j}\,n^{2j+1}f_j\,(1 - n^{-2}g_j), \tag{20}$$

where higher powers of $n^{-2}$ have been neglected. Similarly, for the grouped set,

$$\sum_i T_j^2(x_{Ni}) = (r^2\rho_n)^{2j}\,N^{2j+1}f_j\,(1 - N^{-2}g_j). \tag{21}$$

* It should be noted that $T_j(x)$ and $T_j(\epsilon)$ are different sets of polynomials. The use of the symbol $T_j$ to denote any set of orthogonal polynomials is convenient, but it is realized that it may be confusing in certain cases. The two sets of orthogonal polynomials are here distinguished by the use of different symbols $x$ and $\epsilon$ for the variables.
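The orthogonal coefficients $a_j$ of (18) and the sums $\sum T_j^2(x_i)$ entering (19) can be obtained numerically for any set of unequally spaced $x_i$. The following is a minimal sketch, not the Doolittle scheme used in the paper: it builds the monic orthogonal polynomials by Gram-Schmidt orthogonalisation of the powers $x^j$ at the points of observation. The function names and the sample data are hypothetical.

```python
def orthogonal_fit(x, y, p):
    """Return (a, ss): a[j] is the coefficient of T_j, ss[j] is sum_i T_j(x_i)^2.

    T_j is the monic polynomial of degree j orthogonal over the points x_i to all
    polynomials of lower degree; only its values at the x_i are constructed.
    """
    basis, ss, a = [], [], []
    for j in range(p + 1):
        t = [xi ** j for xi in x]                   # values of x^j at the points
        for tk, sk in zip(basis, ss):               # remove components along T_0..T_{j-1}
            c = sum(u * v for u, v in zip(t, tk)) / sk
            t = [u - c * v for u, v in zip(t, tk)]
        s = sum(u * u for u in t)
        basis.append(t)
        ss.append(s)
        a.append(sum(u * v for u, v in zip(y, t)) / s)
    return a, ss

def coefficient_variances(ss, var_y):
    """Equation (19): var a_j = var y / sum_i T_j(x_i)^2."""
    return [var_y / s for s in ss]
```

For data lying exactly on a polynomial of degree $p$, the last coefficient returned equals the leading power-series coefficient, in agreement with the remark that $b_{pp}$ is identical with $a_p$.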


In these formulae $f_j$ and $g_j$ are functions of $\kappa_2$ and $\kappa_3$ but not of $n$ or $N$. It will be assumed in the subsequent discussion that the number of observations is so large that the term in $n^{-2}$ may be omitted.

Since $y_{Ni}$ is the sum of $r$ observations $y_i$,

$$\operatorname{var} y_{Ni} = r \operatorname{var} y_i. \tag{22}$$

If the curve fitted to the $N$ grouped observations is written as

$$u_{pN}(x) = \sum_j a_{jN}T_j(x), \tag{23}$$

then

$$\operatorname{var} a_{jN} = \operatorname{var} y_{Ni} \Big/ \sum_i T_j^2(x_{Ni})$$

and

$$(\operatorname{var} a_j)/(\operatorname{var} a_{jN}) = r^{2(j-1)}(1 - N^{-2}g_j).$$

The efficiency $\eta(a_{jN})$ of the grouped estimate will then be given by

$$\eta(a_{jN}) = 1 - N^{-2}g_j. \tag{24}$$

In practice the $g_j$ are too complicated to be represented by an explicit formula, but it is fairly straightforward to calculate them for given values of $\kappa_2$ and $\kappa_3$. The expressions for $\eta(a_{jN})$ are listed in Table 1 for polynomials of the first, second and third degrees, and for $\kappa_2^2 = 0(0.5)1.0$, $\kappa_3 = -1.0(0.5)1.0$.
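Formula (24) is simple enough to evaluate directly once the $g_j$ are known. The sketch below is illustrative: the values $g_1 = 1.46$, $g_2 = 9.9$, $g_3 = 31.1$ are those quoted in Table 4(c) of the worked example ($N = 17$), and the helper for choosing $N$ is an addition made here for illustration, not a procedure stated in the paper.

```python
import math

def grouped_efficiency(g_j, N):
    """Equation (24): efficiency of the grouped estimate of a_j with N groups."""
    return 1.0 - g_j / N ** 2

def smallest_n_groups(g_j, target):
    """Smallest N for which 1 - g_j / N^2 is at least `target` (hypothetical helper)."""
    return math.ceil(math.sqrt(g_j / (1.0 - target)))

# g_j as quoted in the worked example for degrees 1, 2 and 3
example_g = {1: 1.46, 2: 9.9, 3: 31.1}
example_eff = {j: grouped_efficiency(g, 17) for j, g in example_g.items()}
```

Because the loss of efficiency varies as $N^{-2}$, doubling the number of groups reduces the loss to one-quarter.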


From these expressions the efficiencies for any value of N can be determined. The large 


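BIAS OF THE GROUPED ESTIMATES

If the polynomial is of the second or third degree, the grouping will give rise to bias in the estimates. The origin of the bias can be seen from the expectation of an observation,

$$E(y_i) = \sum_j b_{jp}x_i^j.$$

Hence

$$E(y_{Ni}) = \sum_r E(y_i) = \sum_j b_{jp}\sum_r x_i^j,$$

the suffix $r$ attached to the summation sign indicating that the sum is taken over the particular group. Now

$$\sum_r x_i^j \neq r^{1-j}\Big(\sum_r x_i\Big)^j = r^{1-j}x_{Ni}^j$$

unless $j$ is 0 or 1, and so

$$E(y_{Ni}) \neq \sum_j r^{1-j}b_{jp}x_{Ni}^j. \tag{25}$$

The true value of the coefficient for the grouped curve is

$$b_{jpN} = r^{1-j}b_{jp}.$$

Thus (25) becomes

$$E(y_{Ni}) \neq \sum_j b_{jpN}x_{Ni}^j. \tag{26}$$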
J 


[Tables 1 to 3 are printed sideways on this page of the original and their numerical entries are not recoverable from this copy. Table 1 gives the expressions for the efficiencies $\eta(a_{jN})$ of the coefficients $a_{jN}$; Table 2 gives percentage efficiencies for suggested values of $N$ for polynomials of the first, second and third degrees; Table 3 gives values of the $B_{jk}$ used in calculating the bias in $a_{jN}$. The tables are entered with $\kappa_2^2 = 0, 0.5, 1.0$ and $\kappa_3 = -1.0, -0.5, 0, 0.5, 1.0$.]


Because of this inequality the estimates $b_{jpN}$ obtained from the ordinary normal equations

$$\sum_i\Big(y_{Ni} - \sum_j b_{jpN}x_{Ni}^j\Big)x_{Ni}^k = 0 \tag{27}$$

will in fact be biased estimates.

Again, it is more convenient (although the arithmetic is still very complicated) to calculate the bias in the orthogonal coefficients rather than in the power-series coefficients. The bias in $a_{jN}$ is defined as the difference of the expectation from the true value,

$$E(a_{jN}) - a_{jN}.$$

The bias turns out to be, to order $N^{-2}$, of the form

$$N^{-2}\sum_{k=0}^{p} g_{jk}\,(\rho_n r^2N)^{k-j}\,a_{kN}, \tag{28}$$

where the $g_{jk}$ are functions of $\kappa_2$ and $\kappa_3$. The true values $a_{kN}$ are of course unknown, but an estimate of the bias can be obtained by substituting the calculated values of the $a_{kN}$ in (28).

From (11) and the standard expressions for $\tau_{jN}(\epsilon)$, the range of $\xi_N(\epsilon)$ is $(N-1)(\kappa_{1N} + \tfrac15\kappa_{3N})$. So, from (12), $\rho_n r^2N$ is approximately the range of $X_N(\epsilon)$ and hence also of $x_{Ni}$, written $\mathrm{Rg}\,x_{Ni}$ (the difference between the greatest and least values). Thus the estimate of bias can be written as

$$N^{-2}\sum_{k=0}^{p} g_{jk}\,(\mathrm{Rg}\,x_{Ni})^{k-j}\,a_{kN}. \tag{29}$$

The ratio of bias to standard error will be more useful than the actual bias in deciding whether the grouping will be satisfactory. The standard error of $a_{jN}$ is, from (21),

$$(\rho_n r^2N)^{-j}\,N^{-1/2}f_j^{-1/2}\,\sigma_N,$$

and so the ratio can be put in the form

$$\frac{\text{Bias } a_{jN}}{\text{S.E. } a_{jN}} = \frac{1}{N^{3/2}\sigma_N}\sum_k B_{jk}\,a_{kN}(\mathrm{Rg}\,x_{Ni})^k, \tag{30}$$

where $\sigma_N$ is the standard error of a grouped observation and the $B_{jk}$ are functions of $\kappa_2$ and $\kappa_3$. $B_{j0}$ and $B_{j1}$ vanish, and so for the third-degree polynomial the bias depends on $a_2$ and $a_3$, while for the second-degree polynomial the bias depends on $a_2$ only. As was the case with the $g_j$, the explicit formulae for the $B_{jk}$ are excessively complicated, but it is possible to obtain numerical values for selected $\kappa_2$ and $\kappa_3$. Some of these values are listed in Table 3.

It will often be necessary to ascertain before proceeding with the calculations whether the grouping will give rise to significant bias. In terms of the observations before grouping, equation (30) becomes

$$\frac{\text{Bias } a_{jN}}{\text{S.E. } a_{jN}} = \frac{n^{1/2}}{N^2\sigma}\sum_k B_{jk}A_{kn}, \tag{31}$$

where $\sigma$ is the standard error of an observation and $A_{kn}$ is written for $a_{kn}(\mathrm{Rg}\,x_i)^k$. The values $A_{kn}$ can be estimated in the following way. If the five values of $y_i$ spaced at intervals of one-quarter of the range of $x_i$ are denoted by $y(+\tfrac12)$, $y(+\tfrac14)$, $y(0)$, $y(-\tfrac14)$, $y(-\tfrac12)$, then

$$A_{2n} = 2\left[y(+\tfrac12) + y(-\tfrac12) - 2y(0)\right],$$
$$A_{3n} = \tfrac{16}{3}\left[y(+\tfrac12) - 2y(+\tfrac14) + 2y(-\tfrac14) - y(-\tfrac12)\right]. \tag{32}$$

These are actually the values for a curve passing through the five points. The estimate of $A_{2n}$ will usually be reasonably accurate, the estimate of $A_{3n}$ much less so. If the scatter of the observations is large an average value of $y$ in the particular region should be taken.

$\sigma$ can be estimated roughly from the scatter of the observations. $\kappa_2$ and $\kappa_3$ can be estimated from the values of $x$ corresponding to values of $\epsilon$ of $\pm\tfrac12(n-1)$, $\pm\tfrac14(n-1)$, 0, by the formulae

$$\kappa_2 = \frac{2\left[x(+\tfrac12) + x(-\tfrac12) - 2x(0)\right]}{x(+\tfrac12) - x(-\tfrac12)},$$
$$\kappa_3 = \frac{\tfrac83\left[x(+\tfrac12) - 2x(+\tfrac14) + 2x(-\tfrac14) - x(-\tfrac12)\right]}{x(+\tfrac12) - x(-\tfrac12)}, \tag{33}$$

where $\epsilon$ is the number specifying the position in the sequence.
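The five-point estimates (32) and (33) lend themselves to a short routine. The sketch below assumes the forms of (32) and (33) that follow from passing a cubic through the five points and from the smoothed cubic (36) respectively; the numerical factors 2, 16/3 and 8/3 are therefore a reconstruction rather than a quotation. Function and argument names are hypothetical.

```python
def curvature_terms(y_hi, y_q_hi, y_mid, y_q_lo, y_lo):
    """Estimates of A_2n and A_3n from y at +1/2, +1/4, 0, -1/4, -1/2 of the range of x."""
    A2 = 2.0 * (y_hi + y_lo - 2.0 * y_mid)
    A3 = (16.0 / 3.0) * (y_hi - 2.0 * y_q_hi + 2.0 * y_q_lo - y_lo)
    return A2, A3

def spacing_parameters(x_hi, x_q_hi, x_mid, x_q_lo, x_lo):
    """Estimates of kappa_2 and kappa_3 from x at positions +1/2, +1/4, 0, -1/4, -1/2 of the sequence."""
    rng = x_hi - x_lo
    k2 = 2.0 * (x_hi + x_lo - 2.0 * x_mid) / rng
    k3 = (8.0 / 3.0) * (x_hi - 2.0 * x_q_hi + 2.0 * x_q_lo - x_lo) / rng
    return k2, k3

def smoothed_abscissa(t, k2, k3):
    """xi/n from equation (36) at relative position t = epsilon/n (used only for checking)."""
    return t * (1.0 - 0.5 * k3) + 2.0 * k3 * t ** 3 + k2 * (t * t - 1.0 / 12.0)
```

Applied to abscissae generated from (36) with given $\kappa_2$, $\kappa_3$, the routine returns those parameters exactly; for equally spaced $x$ both parameters vanish.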


ILLUSTRATIVE EXAMPLE

In Table 4 is listed a set of 67 observations of the mechanical equivalent of heat as a function of temperature (Jaeger & von Steinwehr, 1921). Suitable constants have been subtracted from the original observations to reduce their magnitudes. A rough check on the order of the biases for $N = 17$ will first be made.

The range of $x_i$ is from $+30$ to $-15$. Hence values of $y$ in the neighbourhood of $x = +30$, $+19$, $+7\tfrac12$, $-4$, $-15$ are required for substitution in formulae (32). Average values in these regions are of the order of

$$y(+\tfrac12) = 110, \quad y(+\tfrac14) = 80, \quad y(0) = 50, \quad y(-\tfrac14) = \ldots, \quad y(-\tfrac12) = \ldots,$$

and so $A_{2n} = +600$, $A_{3n} = -550$.

These values are of course quite rough, but will suffice for the estimation of the biases. $\sigma$, estimated from the scatter of the observations, is about 30. Since there are 67 observations, the values of $x_i$ for the observations numbered 1, $17\tfrac12$, 34, $50\tfrac12$, 67 are required for the estimation of $\kappa_2$ and $\kappa_3$ from formulae (33). These values are:

$$x(+\tfrac12) = 29.6, \quad x(+\tfrac14) = \ldots, \quad x(0) = \ldots, \quad x(-\tfrac14) = \ldots, \quad x(-\tfrac12) = \ldots,$$

and so $\kappa_2^2 = 0.3$, $\kappa_3 = 0.6$.

Hence the required values of the $B_{jk}$ are read from Table 3, and from equation (31)

$$\frac{\text{Bias}}{\text{S.E.}} = \frac{67^{1/2}}{17^2 \times 30}\,(0.07 \times 600 - 0.12 \times 550) = -0.02.$$

The ratios for the other coefficients will be of the same order, and so the biases with $N = 17$ are negligible. It will be observed that this extremely small bias in the estimates occurs because $A_{2n}$ and $A_{3n}$ are of opposite sign, but even if they were of the same sign the ratio would still only be 0.10.

The grouped values are shown in Table 4(b). The central group contains only three observations, and so the sums for this group are increased by 4/3. The calculations using the Doolittle scheme have been carried out both for the original set of observations and for the
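The rough check of the bias in the example amounts to a single evaluation of (31). The sketch below repeats that arithmetic. The numbers $n = 67$, $N = 17$, $\sigma = 30$, and the products $0.07 \times 600$ and $0.12 \times 550$ are those appearing in the example; attaching the negative sign to $A_{3n}$ rather than to the corresponding $B_{jk}$ is an assumption made here, and the function name is hypothetical.

```python
import math

def bias_to_se_ratio(n, N, sigma, B, A):
    """Equation (31): (sqrt(n) / (N^2 * sigma)) * sum_k B_jk * A_kn."""
    return math.sqrt(n) / (N ** 2 * sigma) * sum(b * a for b, a in zip(B, A))

# Worked example: A_2n and A_3n of opposite sign
ratio = bias_to_se_ratio(67, 17, 30.0, (0.07, 0.12), (600.0, -550.0))

# The same magnitudes taken with the same sign, the less favourable case
ratio_same_sign = bias_to_se_ratio(67, 17, 30.0, (0.07, 0.12), (600.0, 550.0))
```

Both values are small compared with unity, which is the basis for regarding the bias with $N = 17$ as negligible.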


grouped set. Table 4(c) gives the calculated values of $\sum T_j^2(x)$ and the efficiencies of the grouped estimates. These are compared with the efficiencies as predicted by Table 1 for $\kappa_2^2 = 0.3$, $\kappa_3 = 0.6$. The values given by Table 1 are quite close to the true values; in fact, closer than might have been expected considering the inaccuracy of the estimate of $\kappa_3$. Table 4(d) gives the orthogonal coefficients, and the power-series coefficients for the third-degree curves. The biases are seen to be completely negligible, as was predicted. There will, of course, be random differences between the two sets of coefficients, and these completely swamp any bias effects in this case.


Table 4. Illustrative example 


(a) Original observations (n = 67) 


z y = y | r | y | v y 
| | — — 
p^ Em "PG di " i E d p N 
+ 29-60 HER | qaibas | 8 | =0o | xi | -qm 101 
+ 28-36 102 +1060 | 57 | -025 106 — 7-40 165 
+ 26-96 1080 | +919 | sg =117 104 = mal 169 
+25-79 74 | + 915 | 79 | ~148 117 - 7-94 187 
+25-56 25 + 7:76 | 66 Ars 104 | $39 153 
+ 24-34 103 + 6-33 41 — 1-58 97 — 8-55 175 
+23-09 104 +624 | 7 —1-58 104 — 867 202 
+1941 66 + 5-58 50 — 2-59 133 |  — 998 176 
| 

417-19 35 + 536 70 =267 | 90 — 11-05 207 
+ 16-64 i | owes | Bn -361 | 437 11-19 200 
+15-79 8 | +411 | 6g -369 | 130 | 1149 230 
+15:75 5 | $59 | € f -em | Nus | —m ane 

| | | 
414-39 69 + 3-24 71 | =B 114 — 13-62 269 
+ 14-32 113 +253 | 80 | 600 201 — 13-85 245 
13-98 45 + 1-80 | 83 | —603 114 —14-35 315 
+12-54 68 + 1-41 80 | —697 137 — 15-25 291 

+ 116 89 | 

+ 113 96 
+ 0-01 99 | 
k 


+92:40 298 + 25-88 228 = T20 438 
+65:37 376 4-18-25 272 = ets 475 
+ 55-23 295 + 8-98 314 — 24-12 566 


x y a y m y w y j 
+110-71 387 4-40-43 282 — 265 446 — 29-88 622 




Table 4 (cont.) 
(c) Efficiencies a (ax) 


pre | Wi | SUE 2 
| YT; | XB | y= Tiy Tin 7 (Table 1) 
| e m ls "UM E | | 
| | | 
[D à iE 36861 | 0-9946 | 1—146/289 = 0:9950 
| 3-66 x 10! 8922 x 10! 0-970 1— 99/289 = 0-966 
| 1-913 x 105 1780 x 10° 0-909 1—31-1/289 = 0-892 


(d) Orthogonal coefficients 4 and 


power-series coefficients b 


j 


jn 
0 xm - 
l — 
2 — 3-59 + 0-25 
3 +0-261 + 0-020 


—0-0074 + 0-0018 


| AJN ran | Dain [^ ribs 
—|— a ane | — 
— — 92-4444 372.18 93-0 
| —359 —3:59 | — 5:57 + 0-48 | — 5:60 — 5-60 
| +0-0661 40.264 | +.0-407 + 0-040 +0°1002 4-0:401 
—0-000451 | — -0072 | bss = as 
| 


The standard error of an observation may be estimated from the usual formulae

$$\Big\{\sum e^2/(n-p-1)\Big\}^{1/2} \quad\text{and}\quad \Big\{\sum e_N^2/(N-p-1)\Big\}^{1/2}$$

in the ungrouped and grouped cases respectively. In both cases the sum of the squares of the residuals can be found from the formula

$$\sum e^2 = \sum y^2 - \sum_j a_j^2 \sum T_j^2.$$

In the present example the estimate of $\sigma$ in the ungrouped case is 24, and the estimate of $\sigma_N$ in the grouped case is 37. These should be in the ratio 1:2. It is not clear whether the departure from this ratio is significant.

WEIGHTING OF GROUPED OBSERVATIONS

Often it is not possible to find a suitable pair of values $N$, $r$, such that $n = Nr$. More usually

$$n = Nr + v, \tag{34}$$

where $v$ is small. The $v$ additional observations should be included in $v$ groups near the centre of the range of $x_i$, and in forming these groups the sums of the $x_i$ and $y_i$ values must be multiplied by $r/(r+1)$ to bring them to the same scale as the other groups. If $v$ is negative the factor is $r/(r-1)$.

These groups should be given a weight $(r+1)/r$ in forming the moments and the sums of the powers. This greatly increases the time required to evaluate these sums, and the calculations can be carried out much more rapidly if the weights are omitted. The omission of the weights does not lead to any increase in the bias, but it does modify the efficiency. A detailed investigation, too long to be reproduced here, indicates a reduction in efficiency by a factor which differs from unity only by a term of the order of $N^{-1}$, which will almost always be negligible. Hence it is recommended that the grouped observations be left unweighted.

If the original set of observations $y_i$ have different weights $w_i$, then $\sum w_i$ replaces $n$ in formula (34), and the observation of weight $w_i$ is just regarded as equivalent to $w_i$ observations of unit weight. The observations are divided into $N$ groups each having the same $\sum w_i$.
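The grouping rule for the case in which $n$ is not an exact multiple of $r$ can be sketched as follows. This is an illustration under stated assumptions, not the author's computing scheme: the observations are sorted by $x$, the $|v|$ groups nearest the centre of the sequence are given one observation more (or fewer) than $r$, and their sums are rescaled to the common scale of $r$ observations, with no weights applied. The choice of exactly which central groups are altered, and all names, are hypothetical.

```python
def group_sums(xs, ys, N, r):
    """Return a list of (sum_x, sum_y, size) for N groups of nominal size r.

    n = N*r + v observations are allowed (v may be negative); the |v| groups
    nearest the centre take r + 1 (or r - 1) observations and their sums are
    multiplied by r / size so that every group is on the scale of r observations.
    """
    n = len(xs)
    v = n - N * r
    if abs(v) > N:
        raise ValueError("n is too far from N * r for this scheme")
    order = sorted(range(n), key=lambda i: xs[i], reverse=True)
    sizes = [r] * N
    step = 1 if v > 0 else -1
    # group indices ordered by distance from the centre of the sequence
    central = sorted(range(N), key=lambda g: abs(g - (N - 1) / 2))[:abs(v)]
    for g in central:
        sizes[g] += step
    groups, pos = [], 0
    for size in sizes:
        idx = order[pos:pos + size]
        scale = r / size
        groups.append((scale * sum(xs[i] for i in idx),
                       scale * sum(ys[i] for i in idx),
                       size))
        pos += size
    return groups
```

With 67 observations, $N = 17$ and $r = 4$, the central group contains three observations and its sums are multiplied by 4/3, as in the illustrative example.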


STEP-FUNCTION METHODS

For the first-degree polynomial the use of step functions provides the most rapid method of determining the line which fits the observations. Since in practice the majority of the curves to be fitted will be of the first degree, a full discussion of the use of step functions in these cases will now be presented.

If the scale factor is removed, the independent variable may be represented as

$$\xi(\epsilon) = \kappa_1\epsilon + \kappa_2n^{-1}(\epsilon^2 - \tfrac{1}{12}n^2) + 2\kappa_3n^{-2}(\epsilon^3 - \tfrac{3}{20}n^2\epsilon), \tag{35}$$

terms of the order of $n^{-2}$ being neglected. Since $\kappa_1$ is $1 - \tfrac15\kappa_3$, this can be regrouped as

$$\xi(\epsilon) = \epsilon(1 - \tfrac12\kappa_3) + 2\kappa_3n^{-2}\epsilon^3 + \kappa_2n^{-1}(\epsilon^2 - \tfrac{1}{12}n^2). \tag{36}$$

If $w_1(\epsilon)$ is any function of $\epsilon$ for which $\sum w_1(\epsilon) = 0$, then

$$b_1 = \sum w_1(\epsilon)\,y(\epsilon) \Big/ \sum w_1(\epsilon)\,\xi(\epsilon)$$

will provide an estimate of the slope of the straight line which fits the observations. The standard error of this estimate is given by

$$\sigma^2(b_1) = \sigma^2(y)\sum w_1^2(\epsilon) \Big/ \Big\{\sum w_1(\epsilon)\,\xi(\epsilon)\Big\}^2. \tag{37}$$
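The slope estimate $b_1$ is a ratio of two weighted sums and is easily sketched in code. The example below assumes a step function with $N = 2m+1$ steps of equal size and integer weights running from $-m$ to $+m$ in order of increasing $x$ (so that the weights sum to zero); it requires $n$ to be an exact multiple of $N$. These choices follow the equal-step case discussed in the text but the routine itself, including its name, is hypothetical.

```python
def step_function_slope(xs, ys, m):
    """Estimate the slope b_1 = sum(w*y) / sum(w*x) with a (2m+1)-step weight function.

    The observations are ranked by x, divided into 2m+1 groups of equal size,
    and the g-th group (counting from the smallest x) receives weight g - m.
    """
    N = 2 * m + 1
    n = len(xs)
    if n % N != 0:
        raise ValueError("this sketch needs n to be a multiple of 2m + 1")
    size = n // N
    order = sorted(range(n), key=lambda i: xs[i])
    num = den = 0.0
    for g in range(N):
        w = g - m                       # weights -m, ..., 0, ..., +m sum to zero
        for i in order[g * size:(g + 1) * size]:
            num += w * ys[i]
            den += w * xs[i]
    return num / den
```

Because the weights sum to zero, any constant term in $y$ drops out, and observations lying exactly on a straight line return its slope whatever the spacing of the $x_i$.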
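For a step function, $\sum w_1(\epsilon)f(\epsilon)$ is of the form

$$\sum_{k=1}^{m}\beta_k\sum_{\epsilon=\frac12(\alpha_{k-1}+1)}^{\frac12(\alpha_k-1)}\{f(\epsilon) - f(-\epsilon)\}, \tag{38}$$

the numbers $\tfrac12(\alpha_k - 1)$ being the values of $\epsilon$ at the ends of the steps and the $\beta_k$ the weights attached to the steps. Neglecting terms of the order of $n^{-2}$,

$$\sum_{\epsilon=\frac12}^{\frac12(\alpha-1)}\{\epsilon^j - (-\epsilon)^j\} = \{n^{j+1}/(j+1)2^j\}\,a^{j+1}, \quad j \text{ odd}; \qquad = 0, \quad j \text{ even}, \tag{39}$$

where $na = \alpha$. Hence, from (38),

$$\sum w_1(\epsilon)\,\epsilon^j = \{n^{j+1}/(j+1)2^j\}\sum_{k=1}^{m}\beta_k\,(a_k^{j+1} - a_{k-1}^{j+1}) \tag{40}$$

when $j$ is odd, and

$$\sum w_1^2(\epsilon) = n\sum_{k=1}^{m}\beta_k^2\,(a_k - a_{k-1}). \tag{41}$$

For the equally spaced case it has been shown (Guest, 1954) that the estimate $b_1$ has smallest standard error when the steps are of equal size and the weights are $m, m-1, \ldots$ Using then the values $\beta_k = k$, $a_k = (2k+1)/(2m+1)$, in (40),

$$\sum w_1(\epsilon)\,\epsilon = \{2n^2/(2m+1)^2\}\sum k^2, \tag{42}$$

$$\sum w_1(\epsilon)\,\epsilon^3 = \{n^4/2(2m+1)^4\}\sum(4k^4 + k^2), \tag{43}$$

while from (41)

$$\sum w_1^2(\epsilon) = \{2n/(2m+1)\}\sum k^2. \tag{44}$$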


Substituting these values in (37) and inserting the standard expressions for $\sum k^j$, it is found that

$$\sigma^2(y)/\sigma^2(b_1) = \tfrac{1}{12}n^3\left[1 - (2m+1)^{-2}\right]\left[1 - \tfrac15\kappa_3\{1 + (2m+1)^{-2}\}\right]^2, \tag{45}$$

while for the least-squares estimate $b_1 = a_1$, (19) and (20) give

$$\sigma^2(y)/\sigma^2(b_1) = n^3f_1. \tag{46}$$

The explicit expression for $f_1$ is

$$12f_1 = 1 - \tfrac25\kappa_3 + \tfrac{1}{15}\kappa_2^2 + \tfrac{2}{35}\kappa_3^2. \tag{47}$$

Therefore the efficiency $\eta(b_1)$ of the estimate $b_1$ is given by

$$\eta(b_1) = (1 - N^{-2})\left[1 - \tfrac15\kappa_3(1 + N^{-2})\right]^2 \big/ 12f_1, \tag{48}$$

where $N = 2m+1$ is the number of groups.

The function $\eta(b_1)$ is tabulated in Table 5 for values of $N$ equal to 3, 5, 7, $\infty$. The efficiencies when $N = 3$ are perhaps a little low, but the value $N = 5$, corresponding to double-step functions, gives satisfactory efficiencies.
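The efficiency (48), with $f_1$ from (47), can be evaluated for any $N$, $\kappa_2^2$ and $\kappa_3$. The sketch below assumes the reconstructed forms $12f_1 = 1 - \tfrac25\kappa_3 + \tfrac1{15}\kappa_2^2 + \tfrac2{35}\kappa_3^2$ and $\eta(b_1) = (1-N^{-2})[1-\tfrac15\kappa_3(1+N^{-2})]^2/12f_1$; these reproduce the legible entries of Table 5, for example 92.9 for $N = 5$, $\kappa_2^2 = 0.5$, $\kappa_3 = 0$. The function names are hypothetical.

```python
import math

def twelve_f1(k2_squared, k3):
    """Equation (47): 12 * f_1 as a function of kappa_2^2 and kappa_3 (reconstructed form)."""
    return 1.0 - 0.4 * k3 + k2_squared / 15.0 + (2.0 / 35.0) * k3 ** 2

def step_efficiency(N, k2_squared, k3):
    """Equation (48): efficiency of the step-function slope estimate with N equal groups."""
    inv = float(N) ** -2                # equals 0.0 when N is math.inf
    numerator = (1.0 - inv) * (1.0 - 0.2 * k3 * (1.0 + inv)) ** 2
    return numerator / twelve_f1(k2_squared, k3)

def efficiency_table(Ns=(3, 5, 7, math.inf),
                     k2_squared_values=(0.0, 0.5, 1.0),
                     k3_values=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Percentage efficiencies, rounded to one decimal, keyed by (N, kappa_2^2, kappa_3)."""
    return {(N, k2s, k3): round(100.0 * step_efficiency(N, k2s, k3), 1)
            for N in Ns for k2s in k2_squared_values for k3 in k3_values}
```

For equally spaced observations ($\kappa_2 = \kappa_3 = 0$) the efficiency reduces to $1 - N^{-2}$, that is 88.9, 96.0 and 98.0 per cent for $N = 3$, 5 and 7.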
Table 5. Percentage efficiencies in first-degree step-function methods

κ2²            0                              0.5                             1.0
κ3     -1.0  -0.5    0    0.5   1.0    -1.0  -0.5    0    0.5   1.0    -1.0  -0.5    0    0.5   1.0

N = 3  91.1  90.4  88.9  86.3  81.8    89.1  88.0  86.0  82.9  77.9    87.1  85.7  83.3  79.7  74.3
N = 5  96.1  96.4  96.0  94.6  91.6    94.0  93.8  92.9  90.9  87.2    91.9  91.3  90.0  87.5  83.2
N = 7  97.5  98.0  98.0  97.0  94.4    95.3  95.4  94.8  93.2  89.9    93.2  92.9  91.8  89.7  85.7
N = ∞  98.8  99.6 100.0  99.5  97.4    96.6  97.0  96.8  95.6  92.7    94.5  94.5  93.8  91.9  88.4

An account has been given in two earlier papers (Guest, 1952, 1953b) of the use of single-step functions in the fitting of higher degree polynomials. Step-function methods are rapid and give unbiased estimates, but the efficiencies are rather low if the departure from uniform spacing is at all pronounced. An investigation into the use of double-step functions has been carried out, but the improvement in efficiency is only slight for the higher degrees. One disadvantage of the earlier method is that different functions are used for the second- and third-degree curves. It has since been found that if the steps in the third-degree equation are weighted in the ratio 2:1, the same functions can be used for the second- and third-degree curves without any great drop in efficiency. The recommended step functions are now given by the expressions below:
S4 
3 n 
Zero degree: Es : 
Enaut Anu En; 


First degree: Sines Xni 
Second degree: Ene, + Aaz z 
> p ` , D os 
pe 2Xng + 2ts2— Engs — Eng; + Ens; + 23n3, — 2n. 


Third degree: : 
The observations are numbered from 1 to $n$, and $\sum_{n_{jk}}$ signifies the sum for all the observations from 1 to $n_{jk}$. The values $n_{jk}$ for the various step functions are the integers nearest to the numbers shown below:

$n_{11}$: $0.33n$; $n_{21}$: $0.11n$; $n_{22}$: $0.25n$;

$n_{31}$: $0.07n$; $n_{32}$: $0.15n$; $n_{33}$: $0.44n$;

$n'_{jk} = n - n_{jk}$.


CONCLUSION

To obtain an idea of the times required for the various methods, the calculations were performed by each method for the example given in Table 4. Table 6 gives a summary of the times required to fit polynomials of different degrees (including the checking of the calculations) by the various methods, together with the efficiencies and the estimates of the coefficients of highest degree. It will be seen that for the first-degree polynomial the step-function method is by far the most rapid one. For the second-degree polynomial the step-function and the least-squares grouped methods require about the same time, while for the third-degree polynomial the least-squares grouped method is the most rapid.


Table 6. Results for selected example (67 observations)

                          1st degree         2nd degree                      3rd degree
                          Time     Effic.    Time     Effic.   a2 (×10²)     Time     Effic.   a3 (×10⁴)
                          (min.)             (min.)                          (min.)

Least-squares              98      1.000     158      1.000    26.1 ± 2.3    246      1.000    -74 ± 18
Least-squares grouped      29      0.936      61      0.925    25.8          100      0.909    -72
Single-step functions      16      0.825      60      0.831    28.4          128      0.725    — 9
Double-step functions      17      0.898      67      0.835    28.9          148      0.763    -71


The general conclusion to be drawn from the work described here is that each method has its disadvantages. The full least-squares method is efficient and without bias, but takes a very long time, especially if the computor is not familiar with the technique. The least-squares grouped method is rapid and the efficiency is high, but there may be bias in the estimates. The single-step function method is rapid and of fair efficiency, and there is no bias, but the standard errors cannot be estimated in any simple way.

Perhaps the following statements will provide a satisfactory guide in the selection of the appropriate method for the fitting of a second- or third-degree polynomial. If the scatter of the observations is large the least-squares grouped method should be used, since the bias is not then likely to be of importance. Cases will also arise in which it is possible to remove the greater part of the variation in the dependent variable by the use of approximate values of the coefficients. In particular, if the approximate contributions $b_2x^2 + b_3x^3$ are subtracted before grouping, the corresponding least-squares coefficients obtained from the grouped values will be so small that the bias due to grouping will be negligible. The efficiency is unaffected by this procedure. If this procedure is not convenient and the scatter is small the step-function method may be used, but if accurate estimates of standard error are also required there is no alternative but to fit a least-squares curve to the full set of observations.


REFERENCES

Guest, P. G. (1950). Phil. Mag. [7], 41, 124.
Guest, P. G. (1952). Aust. J. Sci. Res. …, 238.
Guest, P. G. (1953a). Aust. J. Phys. …
Guest, P. G. (1953b). Aust. J. Phys. …
Guest, P. G. (1954). Biometrika, 41, 62.
Hayes, J. G. & Vickers, T. (1951). Phil. Mag. [7], 42, 1387.
Jaeger, W. & von Steinwehr, H. (1921). Ann. Phys., Lpz., 64, 305.




ON THE JOINT DISTRIBUTION OF THE CIRCULAR SERIAL 
CORRELATION COEFFICIENTS 


By G. S. WATSON†
The Australian National University, Canberra, A.C.T.

1. INTRODUCTION

Quenouille (1949) applied Koopmans's method (1942) to the problem of finding the exact joint distribution of the serial correlation coefficients in the null case. Madow's (1945) device was used to derive the non-null joint density function. For odd sample sizes, he gave an explicit formula for the exact density function. Since the direct smoothing of this function is very difficult, Quenouille conjectured a smooth approximation to the exact density and supported his conjecture by various arguments about the expected form of the result. Jenkins (1954), following Dixon's (1944) method, has found a smoothed density for the first two serial correlation coefficients which, unlike Quenouille's density, has the correct moments up to order $n$. These smoothed densities were derived for the case of no mean correction.

In §2 of the present paper, von Neumann's (1941) method is generalized to give integral equations for the joint density of the serial correlations. From these the relationship of the null and non-null densities is easily derived. In §3 the exact density of the circular serial correlations is obtained for arbitrary sample sizes, and a fuller discussion is given of the interesting summation rule involved. In §4 an attempt is made to derive an approximate null density when mean corrections are made by a new smoothing method. The form found differs only slightly from Quenouille's and is subject to the same weaknesses.


2, SOME GENERAL RESULTS 


The basic problem is to find the joint probability density function of

$$r_j = \frac{x'A_jx}{x'x} \qquad (j = 1, \ldots, q), \tag{1}$$

where the elements of the column vector $x$ are independent normal variates with zero mean and unit variance. It does not seem possible to make any progress unless the $A_j$ form a commutative set of matrices. In the present applications this means that serial correlations with a modified definition must be considered. With this assumption the problem may be reduced to that of the joint distribution of

$$r_j = \frac{\lambda_0^{(j)}w_0 + \lambda_1^{(j)}w_1 + \ldots + \lambda_m^{(j)}w_m}{w_0 + w_1 + \ldots + w_m} \qquad (j = 1, \ldots, q), \tag{2}$$
fM M orted in this paper was done while the author was a Re T i 
Daun xr of the wor! : Tole nomics; University of Cambridge, and formed part of "i cn eee tud ihe 
ethene ah be AM ni versity of : orth Carolina, Mimeograph Series no. 49. anri doe 
: eee i n nsn n = ii € eam joint density, with and without 
; n Ly not compar: i J à 
> q parable with n. [See also the paper 


Can corrections, fF m t : i 1 
by o. RU poe (1956): These two papers are printed on pp. 169 and 186 below. Ep.] 




where $\lambda_0^{(j)}, \lambda_1^{(j)}, \ldots, \lambda_m^{(j)}$ are the distinct latent roots of $A_j$ with multiplicities $p_0, p_1, \ldots, p_m$ and $w_0, w_1, \ldots, w_m$ are independent gamma variables of orders $p_0, p_1, \ldots, p_m$. The multiplicities of the roots of $A_j$ associated with any latent vector must be independent of $j$ to obtain the reduction (2).
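The commutative property required of the $A_j$ is easily exhibited for the circular definition. The sketch below assumes, as an illustration, that $A_j$ is the symmetric circulant matrix with entries $\tfrac12$ at circular lag $\pm j$, so that $x'A_jx = \sum_t x_tx_{t+j}$ with suffixes taken modulo $N$; the latent roots are then $\cos(2\pi jk/N)$, the form referred to in §3. The function names are hypothetical.

```python
import math

def lag_matrix(N, j):
    """Symmetric circulant matrix A_j with x'A_j x = sum_t x_t x_{t+j} (indices mod N)."""
    A = [[0.0] * N for _ in range(N)]
    for t in range(N):
        A[t][(t + j) % N] += 0.5
        A[t][(t - j) % N] += 0.5
    return A

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    N = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def max_abs_diff(A, B):
    """Largest absolute elementwise difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def root_residual(N, j, k):
    """Size of A_j v - cos(2*pi*j*k/N) v for the vector v_t = cos(2*pi*k*t/N)."""
    A = lag_matrix(N, j)
    v = [math.cos(2 * math.pi * k * t / N) for t in range(N)]
    lam = math.cos(2 * math.pi * j * k / N)
    Av = [sum(A[t][s] * v[s] for s in range(N)) for t in range(N)]
    return max(abs(Av[t] - lam * v[t]) for t in range(N))
```

All the $A_j$ share the same latent vectors, which is why the multiplicities associated with any latent vector do not depend on $j$ in this case.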


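Writing $r_j = u_j/v$ $(j = 1, \ldots, q)$, Pitman's theorem (1937) gives immediately

$$E(r_1^a r_2^b \ldots) = \frac{E(u_1^a u_2^b \ldots)}{E(v^{a+b+\ldots})}, \tag{3}$$

since $v$ is independent of the set of values $r_1, r_2, \ldots, r_q$. Writing $f(r_1, \ldots, r_q)$ for the joint density function of $r_1, \ldots, r_q$ and $g(v) = v^{P-1}e^{-v}/\Gamma(P)$ with $P = \sum_{i=0}^{m} p_i$, the joint moment generating function of $u_1, u_2, \ldots, u_q$ is easily seen to be given by

$$E\{\exp(t_1u_1 + t_2u_2 + \ldots + t_qu_q)\} = E\{\exp[v(r_1t_1 + r_2t_2 + \ldots + r_qt_q)]\}$$
$$= \int\!\cdots\!\int_{D_q} dr_1 \ldots dr_q \int_0^\infty dv\; e^{v\Sigma r_it_i}\, g(v)\, f(r_1, \ldots, r_q)$$
$$= \int\!\cdots\!\int_{D_q} \frac{f(r_1, \ldots, r_q)\, dr_1 \ldots dr_q}{\left(1 - \sum_{i=1}^{q} r_it_i\right)^P}, \tag{4}$$

where $D_q$ is the domain of joint variation of $r_1, r_2, \ldots, r_q$, defined below. But the left-hand side of (4) may be evaluated directly to give the integral equation for $f(r_1, \ldots, r_q)$,

$$\prod_{i=0}^{m}\left(1 - \lambda_i^{(1)}t_1 - \ldots - \lambda_i^{(q)}t_q\right)^{-p_i} = \int\!\cdots\!\int_{D_q} \frac{f(r_1, \ldots, r_q)\, dr_1 \ldots dr_q}{\left(1 - \sum_{i=1}^{q} r_it_i\right)^P}. \tag{5}$$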

This is a generalization of von Neumann's (1941) integral equation for the density function of a single ratio. Using the results of von Neumann & Morgenstern (1947), the domain $D_q$ is seen to be the least convex polyhedron enclosing the points $(\lambda_i^{(1)}, \ldots, \lambda_i^{(q)})$ $(i = 0, 1, \ldots, m)$ in the $q$-space of $r_1, r_2, \ldots, r_q$.

The integral equation (5) provides a simple method of deriving the joint density of $r_1, r_2, \ldots, r_q$ when the $x$ vector has a multivariate normal density function proportional to

$$\exp\{-\tfrac12 x'(I - \delta_1A_1 - \ldots - \delta_qA_q)x\}, \tag{6}$$

where $\delta_1, \ldots, \delta_q$ are such that the density is non-singular. Since
a y _ _ Et6X'A;x tx —X6;X'A,X) — Zt a 
AuxAx- xx- Y6x'Ajx X—20;X AjX) = i-es x(r- $ 3A) z (7) 


where the first factor of (7) is a homogeneous function of degree zero in gamma variables and 
where the second factor of (7) is the sum of these Vàriateg. 4 gamma-P variate. Tb® 
argument leading to (5) now gives 


m AQ Lus FAP Ja e v ) dp 
Bi p Jehan. 5 
D 35) 


an integral equation for the non-null joint density of $r_1, \ldots, r_q$. By substitution in (8), it is easily seen that

$$f_\delta(r_1, \ldots, r_q) = \prod_{i=0}^{m}\Big(1 - \sum_j \delta_j\lambda_i^{(j)}\Big)^{p_i}\Big(1 - \sum_j \delta_jr_j\Big)^{-P} f(r_1, \ldots, r_q), \tag{9}$$

where $f(r_1, \ldots, r_q)$ is the solution of (5), the null density. A similar, but incorrect, extension of Madow's (1945) result to the joint distribution has been noted by Quenouille (1949).
ai shows that the distributions of the multiple serial correlation 75.382...) and the 
rtial serial correlation —-—— independent of à, ...,À,- This fact for q = 1 has 
een noted by Dixon (194 4) for the means and variances of these statistics and by Jenkins 


(19 
n n the distributions. 
The integral equations may also be used to demonstrate the ultimate normality of the joint distribution when the sample size tends to infinity. By a different method, Hsu (1946) examined this question for individual ratios.
3. THE EXACT DENSITY FUNCTION 


Fro 
m $2. it is sufficient to consider the null case. Then 


» (2; — 9) (ti?) 

M a (10) 
X(u-9* 

i=l 

are independent stand 


y= 


ard normal variables. The 


whi 
ere 
yi; = v; and where Ty, a eo EN 


Canon; 
nical form of (10) is e j 
E E * ( 
10’) 


al distributions of the rj. For our purposes, we 


eos For any fixed j, 


An : 
Merson (1942) has given the exact margin 
f the latent roots COS, 


Wil 3 Len 
| require only to exhibit the distribution O 
irs with the possible exception of 


2mj , : 
a us = cos NE so that the roots are equal in p 
the Toot N oot. Thus the following formulation will be seen to include 


; minus one, the least T 


Our 
Problem S 
. Suppose that 
pr pwt Ages te tAn, 
t= py a 
= appar tee o (11) 
eects > 
Ty ae wo wit Un 


ples, the first n variables being gamma-1 


amma varia 
are all greater than À, and 4, ..., 4, all greater than 


pz 


Whi 
E Wis., Wp and W are independent g 
the last gamma-P,f where Aj; -++ An 


if N is odd evidently p +n —iN. 
i , 


T If N is even, p 73; 


164 Joint distribution of the circular serial correlation coefficients 


and $\tau_1, \ldots, \tau_n$ all greater than $\tau$. (The problem imposes more restrictions, to be stated later, on the coefficients of the $w$'s.) It is then sufficient to find the joint distribution of


n 
; 
X Aw; 


re _ Erw, t (12) 


s =r =fr T= . 
ay ty rU w+ilw, v 


~ webu, wv 
The region of joint variation of $s_1, s_2, \ldots, s_q$ is the least convex polyhedron enclosing the points $(0, 0, \ldots, 0)$, $(\lambda_1', \mu_1', \ldots, \tau_1')$, ..., $(\lambda_n', \mu_n', \ldots, \tau_n')$, a polyhedron in the positive quadrant of the $q$-space of $s_1, s_2, \ldots, s_q$. It is convenient to drop the primes for the moment.

The joint characteristic function of l,m, ...,t and v is 


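$$\phi(\theta_1, \theta_2, \ldots, \theta_q, \theta_v) = \prod_{i=1}^{n} (1 - i\theta_1\lambda_i - i\theta_2\mu_i - \ldots - i\theta_q\tau_i - i\theta_v)^{-1}\,(1 - i\theta_v)^{-p}. \qquad (13)$$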


The right-hand side of (13) admits a partial fraction expansion (Watson, 1951) so that we have, for $q < n$,


ETE, 14 
P llys +225 0,5) = Bed z ihig ——— AU ) 
Asc... * eu . Fi " x 
* (1 — i0, yn-ato Jl, 07 95,614, 0, — ... — 10,7, — 10,4) 
Au fy, Tya 
Aj, Hj, Tj, 
where hitaat = Aig Hig Tal (15) 
3 MG 7; | 
1 h Bh Ti 
Jis jo : H : : | 
l An Pip was Tj, 


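provided the $\lambda$'s, $\mu$'s, ..., $\tau$'s are such that all the $C$'s are finite. The general term on the right-hand side of (14) may be recognized as the joint characteristic function of

$$\begin{aligned}
l &= \lambda_{j_1} w_{j_1} + \lambda_{j_2} w_{j_2} + \ldots + \lambda_{j_q} w_{j_q},\\
m &= \mu_{j_1} w_{j_1} + \mu_{j_2} w_{j_2} + \ldots + \mu_{j_q} w_{j_q},\\
&\;\;\vdots\\
t &= \tau_{j_1} w_{j_1} + \tau_{j_2} w_{j_2} + \ldots + \tau_{j_q} w_{j_q},\\
v &= w_{j_1} + w_{j_2} + \ldots + w_{j_q} + w',
\end{aligned} \qquad (16)$$

where $w'$ is gamma $n + p - q$ and is independent of $w_{j_1}, w_{j_2}, \ldots, w_{j_q}$. The joint density of these variables may be found ab initio to be (writing sign of $x$ = sgn $x$ so that $|x| = x\,\mathrm{sgn}\,x$)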
À 


d; e Th , 
sgnj| : : 1 Vi mwh tly non-a-i 
1 Ag e Tig emaema T Ay Bh cune T (17) 
A4 o ap Tin +p-@)| 7 : n . 
: : i As, Lo m iy 
A q 
dq T$ 


Thus, applying the Fourier inversion operator to (14), using (17) and integrating $v$ out from 0 to $\infty$, the joint density of $s_1, s_2, \ldots, s_q$ is found to be


ntp-q-l 
Sq 


li & % 
| » Ay e Ta | 
f Ag Ji 9 5 sgn| : AE 
d. Bos : Me wea. i 
a ja jq 
(i4 p) "A poA d Tig à (18) 
T — 2 e— ; : 
(n--p—q)g, ies | p DA xe Ty 
Aj + Th À j 
e i 
A A j*hse34| : t 
Ay c L e By 


since the Jacobian of $l, m, \ldots, t, v$ to $s_1, s_2, \ldots, s_q, v$ is $v^q$. The terms in the sum (18) depend on the region of the $(s_1, \ldots, s_q)$, i.e. the term $(j_1, \ldots, j_q)$ belongs to $R(s_1, \ldots, s_q)$, provided $s_1, s_2, \ldots, s_q$ are possible values of $l/v, m/v, \ldots, t/v$ as defined in (16). This means that $s_1, \ldots, s_q$ must fall in the least convex polyhedron with vertices $(0, 0, \ldots, 0)$, $(\lambda_{j_1}, \mu_{j_1}, \ldots, \tau_{j_1})$, ..., $(\lambda_{j_q}, \mu_{j_q}, \ldots, \tau_{j_q})$. In this region


wae $5 71$ 


1 8 5$ S 
1 Àj Hae d Th 
|l Ay M 2 20 (19) 
1 0 0 es 0 
1 Aj, Ma Th 
1 And o Tig 


As Quenouille has pointed out, the general term in the sum is associated with a hyperplane through the points $(\lambda_{j_1}, \mu_{j_1}, \ldots, \tau_{j_1})$, ..., $(\lambda_{j_q}, \mu_{j_q}, \ldots, \tau_{j_q})$ and is, in fact, proportional to the $(n+p-q-1)$th power of the distance of $(s_1, \ldots, s_q)$ from this plane.

Returning now to the original notation, we see that $f(r_1, \ldots, r_q)$ is


viy, 
Siven by p, pira LA X T 
(3, BR A vee 
| i 2 BA CU Th sgn 1 Aj fy, Tj, | 
| 1 : H t 
r 1 3 Hj E. the Aj, Fi r (20) 
(erg). e Some d du |l Aj n «wi 
(n&-p—q)gee Jaen |! | Y X à 
T rg A 7t | Il A Hn ea 
: : | Gi oja | : 3 | 
: à | | i "d 
1 44 fg cT Tig ! E A, Hg 5 T. 


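where the set $R(r_1, \ldots, r_q)$ consists of all $(j_1, \ldots, j_q)$ such that $(r_1, \ldots, r_q)$ is contained in the least convex polyhedron with vertices $(\lambda, \mu, \ldots, \tau)$, $(\lambda_{j_1}, \mu_{j_1}, \ldots, \tau_{j_1})$, ..., $(\lambda_{j_q}, \mu_{j_q}, \ldots, \tau_{j_q})$. In this region, the term in (20) raised to the power $n+p-q-1$ is positive by (19). This is the required distribution. It agrees with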


Quenouille's distribution when $p = 0$; this is the case when $N$ is odd. When $N$ is even, $p = \tfrac12$. Furthermore, when $q = 1$, it reduces to Anderson's (1942) distribution for all $N$.

Quenouille has noted that there are several possible expansions for $f(r_1, r_2, \ldots, r_q)$ when $N$ is odd, i.e. when $p = 0$, $\lambda = 0$, $\mu = 0$, ..., $\tau = 0$. In this case the joint density of $r_1, r_2, \ldots, r_q$ is given by


L n % "y ee 
| Lh AR uu T. | As ^h 
A Ph j sgn| : 
m A : | Big, as Ti 
EIN peeps 5 AR Hig us T is : (21) 
Tín —9) oZ en |i A Hypo) Tj : 
lo, a Tj 
Puede io oio oi 
1 Àj Mig T, 


with the summation rule: $(j_1, \ldots, j_q) \in R(r_1, \ldots, r_q)$ provided $(r_1, \ldots, r_q)$ falls in the convex closure of $(0, \ldots, 0)$, $(\lambda_{j_1}, \mu_{j_1}, \ldots, \tau_{j_1})$, ..., $(\lambda_{j_q}, \mu_{j_q}, \ldots, \tau_{j_q})$. The domain of the variables $r_1, r_2, \ldots, r_q$ is, of course, the convex closure of $(\lambda_i, \mu_i, \ldots, \tau_i)$ $(i = 1, \ldots, n)$. Suppose now that the origin is moved so that we have new variables $r_j^* = r_j + h_j$ $(j = 1, \ldots, q)$ with

$$\lambda_i^* = \lambda_i + h_1, \quad \mu_i^* = \mu_i + h_2, \quad \ldots, \quad \tau_i^* = \tau_i + h_q.$$

The joint density of the $r_j^*$ is simply (21) with asterisks on the $r$'s, $\lambda$'s, $\mu$'s, ..., and $\tau$'s and the summation rule is found similarly. But this density can be rewritten as


l n To. f; [i72 


* 
Ag Eur oe T; AT ses Tj. 
as : Sgn} : : 
4 D * `k 
T(n) x Y Ay B Ti Mo TH (22) 
T(n— doen. 
(n—4)u ien 1 £A; n 7j 
II 1 As, Hy, Tj 
Jis dq : = 


l A, Hy, inc T. 
which is the same as (21) except for the sign term and the summation rule. But the values of the two densities are the same so that, by equating them, identities may be obtained. More usefully, however, it may be observed that this is equivalent to using an arbitrary origin to obtain the sign term and the summation rule. Hence, to calculate (21) as simply as possible, choose the origin so that the point $(r_1, r_2, \ldots, r_q)$ is included in the least number of polyhedra with vertices the origin and $(\lambda_{j_1}, \mu_{j_1}, \ldots, \tau_{j_1})$, ..., $(\lambda_{j_q}, \mu_{j_q}, \ldots, \tau_{j_q})$, and so use the shortest possible summation.


4. AN APPROXIMATE DENSITY FUNCTION 


In §3, the expression (21) gives the joint density of $r_1, r_2, \ldots, r_q$, where $r_1, \ldots, r_q$ are defined by (10) in a sample of size $N = 2n + 1$. The restriction is $q < n$, that is $2q + 1 < N$, whereas Quenouille's argument requires $2q + 3 < N$. If, then, $q$ is taken as $n - 1$, the density is seen to be independent of the values of $r_1, r_2, \ldots, r_{n-1}$; in fact, it is a constant. This result leads to an interesting method of approximating the true density function. The polyhedral region of joint variation of $r_1, r_2, \ldots, r_{n-1}$ is replaced by the (convex) region in which the matrix

$$\begin{bmatrix}
1 & r_1 & r_2 & \cdots & r_{n-1}\\
r_1 & 1 & r_1 & \cdots & r_{n-2}\\
\vdots & & & & \vdots\\
r_{n-1} & r_{n-2} & r_{n-3} & \cdots & 1
\end{bmatrix} \qquad (23)$$

is positive-definite. It may be verified that this region includes the true polyhedral region and that the vertices of the latter lie on its boundary. The variables $r_{q+1}, \ldots, r_{n-1}$ may be integrated over this region, using a formula from Quenouille,


| Fp | ! T(s4- D a (Lol f? 
» [Bab \ pe oe 
to obt fhean & Te" ( eile (24) 

ai 

in the approximate density 

a -1 T(s+2) | Ki] n-g-1 
J(ry fo 0M =m% 2 3 ( ^ ) 
re 2) B T(s) P . (25) 


t density of ry Tis m o Tin 29 n It is easily seen 


An equi 
be i marrón method is to find the join 
proportional to 
a —r?n-1.28...) a = p-2.23..) muU (26) 
to have the range (—1,1), they are seen to be 


ift 
hen A NE. 
2» ci" 1n, 28... 
(25) is obtainable from (26). 


indep 
ae and the density 
nota (25) applies, when N = 2n+ 1, for serial correlations corrected for the m 
Corr, s conjectured densit; function, for arbitrary N i pt Cans 
tected for the mean, is 2 y M sanc. correlations un- 


re all assumed 


wit ti rax [Fal ye 

Expre : s-1 ($ $2 sal (27) 
ones (25) and (27) do not differ by unity in the value of N as might have bee: 

replaced a NOU (27) has been criticized by Watson (1951), and Jenkins (1954) T 

ess ig RAT or q = 2, by a new expression which gives correct moments up to order n. F E 

n about the case when mean corrections are made. However, Jenkins's e 

incorrect moments and is elie 


or 
q= À 
Unae 2 leads us to the conclusion that (25) gives 
ceptable. } 

f Itisofi 

le ae bed interest to note thi may be deduced directly, by elementary 
i Of pew’ of the density of the gaps: made by n-l random points on the unit ay maces Fam 
nadequate se, comparison of 25) or (26) with Daniels's results show explicit tien 
qz AN. 2 His results apply only when 7 is n respect to N. Our sed where they are 
) and may still have some merit when q i$ s are exact when 


at this result r 


REFERENCES

ANDERSON, R. L. (1942). Distribution of the serial correlation coefficient. Ann. Math. Statist. 13, 1-13.
DANIELS, H. E. (1956). The approximate distribution of serial correlation coefficients. Biometrika, 43, 169-85.
DIXON, W. J. (1944). Further contributions to the problem of serial correlation. Ann. Math. Statist. 15, 119-44.
HSU, P. L. (1946). On the asymptotic distribution of certain statistics used in testing the independence between successive observations from a normal population. Ann. Math. Statist. 17, 350-4.
JENKINS, G. M. (1954). Tests of hypotheses in the linear autoregressive model. I. Biometrika, 41, 405-19.
JENKINS, G. M. (1956). Tests of hypotheses in the linear autoregressive model. II. Biometrika, 43, 186-99.
KOOPMANS, T. (1942). Serial correlation and quadratic forms in normal variables. Ann. Math. Statist. 13, 14-33.
MADOW, W. G. (1945). Note on the distribution of the serial correlation coefficient. Ann. Math. Statist. 16, 308-10.
PITMAN, E. J. G. (1937). The 'closest' estimates of statistical parameters. Proc. Camb. Phil. Soc. 33, 212-22.
QUENOUILLE, M. H. (1949). The joint distribution of serial correlation coefficients. Ann. Math. Statist. 20, 561-71.
VON NEUMANN, J. (1941). Distribution of the ratio of the mean square successive difference to the variance. Ann. Math. Statist. 12, 367-95.
VON NEUMANN, J. & MORGENSTERN, O. (1947). Theory of Games and Economic Behaviour. Princeton University Press.
WATSON, G. S. (1951). Serial correlation in regression analysis. No. 49 Mimeograph Series, Institute of Statistics, University of North Carolina.

THE APPROXIMATE DISTRIBUTION OF SERIAL CORRELATION COEFFICIENTS

By H. E. DANIELS
Statistical Laboratory, University of Cambridge

1. INTRODUCTION AND SUMMARY

Hotelling's suggestion of a 'circular' definition for the serial correlation coefficient was followed by considerable progress in the distribution theory of such modified statistics by R. L. Anderson (1942), Koopmans (1942), Dixon (1944), Madow (1945) and others. The exact distribution is known for the circular coefficient of any lag from an uncorrelated normal process and, more generally, from a circularly modified normal process of autoregressive type. Quenouille (1949) obtained by the same method the exact joint distribution of circular coefficients of different lags.

The exact distributions are complicated, and a simple and accurate approximation to the distribution of the circular coefficient with known mean was found by Dixon (1944) and Rubin (1945) for the uncorrelated normal process. It was extended to the case of a circular Markov process by Leipnik (1947) following a method due to Madow (1945). The approximation depends on the device of smoothing summation over a discrete set of roots by an approximating integral. Quenouille (1949) conjectured a similar approximate form for the joint distribution, but Watson (1951) and Jenkins (1954) showed that the conjectured form could not be correct. Jenkins developed the correct analogous approximation for the joint distribution of coefficients of lags 1 and 2 with known means.

Without circular modifications the distributional theory is difficult and the field is largely unexplored. For testing independence T. W. Anderson (1948) gave an approximate table of significance points for non-circular lag 1 coefficients with known and fitted means. Watson & Durbin (1951) proposed modified non-circular definitions of the coefficients which have a known exact distribution in the uncorrelated case. The case of an unmodified autoregressive process has not been much discussed, though a method due to Bartlett (1953, 1954) is available for obtaining approximate confidence intervals for the parameters, and Quenouille's (1947) approximate goodness of fit tests should be noted.

In the present paper an approach based on the method of steepest descents is adopted to derive the known approximate distributions and to generalize them.* (For an account of the method see, for example, Jeffreys & Jeffreys (1956), Daniels (1954).) The analogue of Leipnik's approximation is found for the distribution of the unmodified coefficient of lag 1, with known and fitted mean, when the process is of unmodified Markov type. The approximate joint distribution of $m$ successive partial correlations is found for an autoregressive process of the $m$th order, with circular modification. The work on the unmodified Markov process could be extended to the general case but we have not done this.

* I have learnt that Jenkins (1956) has independently obtained many of the results given here from a consideration of the moments of smoothed distributions. Of our inevitable differences in notation, the one most likely to confuse is the difference in sign of the autoregression coefficients $\alpha_j$. My usage, which is that of Bartlett and Quenouille, is adopted for reasons of symmetry.

170 Approximate distribution of serial correlation coefficients 


2. THE SADDLEPOINT APPROXIMATION

In this section some general theory is developed which is fundamental to the rest of the work. We are interested in the distribution of statistics of the form $r = c/c_0$, where $c_0$ is non-negative. If $c_0$, $c$ have a joint probability density $f(c_0, c)$ the density for $r$ is

$$h(r) = \int_0^\infty c_0\, f(c_0, r c_0)\, dc_0. \qquad (2\cdot1)$$
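The ratio formula (2·1) can be checked numerically on a case where the answer is known in closed form. The sketch below is only an illustration: it assumes, purely for the example, that $c_0$ and $c$ are independent unit exponential variables, for which (2·1) gives $h(r) = 1/(1+r)^2$. The function names, the truncation point `upper` and the step count are hypothetical choices, not part of the paper.

```python
import math


def ratio_density(f, r, upper=60.0, steps=6000):
    """Approximate h(r) = integral_0^inf c0 * f(c0, r*c0) dc0 by Simpson's rule.

    The infinite range is truncated at `upper`; `steps` must be even.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        c0 = k * h
        if k == 0 or k == steps:
            weight = 1
        elif k % 2:
            weight = 4
        else:
            weight = 2
        total += weight * c0 * f(c0, r * c0)
    return total * h / 3.0


def joint_exponential(c0, c):
    """Assumed joint density: c0 and c independent unit exponentials."""
    if c0 < 0.0 or c < 0.0:
        return 0.0
    return math.exp(-c0 - c)


if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        # Numerical value of (2.1) next to the closed form 1/(1+r)^2.
        print(r, ratio_density(joint_exponential, r), 1.0 / (1.0 + r) ** 2)
```

Any other joint density with non-negative $c_0$ can be passed in place of `joint_exponential`, provided the truncation point is large enough for its tail.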


Let

$$M(T_0, T) = E\, e^{T_0 c_0 + T c}$$

be the joint moment-generating function for $c_0$ and $c$. We are concerned only with cases where $M(T_0, T)$ exists in strips of non-zero width containing the imaginary axes in the $T_0$ and $T$ planes. The usual Fourier inversion formula is most conveniently written as

$$f(c_0, c) = \frac{1}{(2\pi i)^2} \iint M(T_0, T)\, e^{-T_0 c_0 - T c}\, dT_0\, dT, \qquad (2\cdot2)$$

the integration being taken along the imaginary axes of $T_0$ and $T$, or any allowable deformations of these paths. In particular,

$$f(c_0, r c_0) = \frac{1}{(2\pi i)^2} \iint M(T_0, T)\, e^{-c_0 (T_0 + rT)}\, dT_0\, dT = \frac{1}{(2\pi i)^2} \iint M(u - rT, T)\, e^{-u c_0}\, du\, dT, \qquad (2\cdot3)$$

where the integration of $u = T_0 + rT$ is taken over a similar path in the $u$ plane. Inversion of the transform with respect to $u$ gives

$$\int_0^\infty f(c_0, r c_0)\, e^{u c_0}\, dc_0 = \frac{1}{2\pi i} \int M(u - rT, T)\, dT, \qquad (2\cdot4)$$

so that, when differentiation is permissible,

$$\int_0^\infty c_0\, f(c_0, r c_0)\, e^{u c_0}\, dc_0 = \frac{1}{2\pi i} \int \frac{\partial M(u - rT, T)}{\partial u}\, dT,$$

and

$$h(r) = \frac{1}{2\pi i} \int \left[\frac{\partial M(u - rT, T)}{\partial u}\right]_{u=0} dT. \qquad (2\cdot5)$$

This is a form of Geary's (1944) extension of Cramér's theorem.

However, we often want to transform $T$ to some other variable $z$. In (2·4) put

$$T = T(z, u), \qquad (2\cdot6)$$

where $T(z, 0)$ maps the $T$ plane on to some region in the $z$ plane, and $\partial T/\partial z$ does not vanish anywhere on the contour except possibly at its termini. Proceeding as before, we find

$$h(r) = \frac{1}{2\pi i} \int \left[\frac{\partial}{\partial u}\left\{M(u - rT, T)\,\frac{\partial T}{\partial z}\right\}\right]_{u=0} dz, \qquad (2\cdot7)$$

integration being along the transformed contour in the $z$ plane. This is the form used in the sequel, but the alternative form

$$h(r) = \frac{1}{2\pi i} \int \left[\left(\frac{\partial M}{\partial u}\right)_z \frac{\partial T}{\partial z} - \left(\frac{\partial M}{\partial z}\right)_u \frac{\partial T}{\partial u}\right]_{u=0} dz \qquad (2\cdot8)$$

is sometimes easier to work with in other applications.


H. E. DANIELS 171 
Our main task is the approximate evaluation of integrals of type (2·7) when the statistics are calculated from a moderately large sample. In the cases considered it is found that the integrand can be written as $\phi(z)\{\psi(z)\}^n$, where $n$ is the sample size, and the method of steepest descents can be applied. The contour is chosen to pass through a suitable saddlepoint $\hat z$, at which $\psi'(\hat z) = 0$. It is taken to be a curve of steepest descent, upon which $|\psi(z)|$ decreases most rapidly on either side of $\hat z$, and this is one of the branches of

$$\arg \psi(z) = \arg \psi(\hat z)$$

(Jeffreys & Jeffreys, 1956). In all the applications considered here the appropriate $\hat z$ is found to be real, the corresponding contour of steepest descent through it intersects the real axis orthogonally, and there is no other saddlepoint on the contour so that $|\psi(z)|$ decreases steadily on either side of $\hat z$. As $n$ becomes large the major contribution to the integral evidently arises from a small neighbourhood around $\hat z$.

The usual procedure is now to expand the integrand in a particular manner about $\hat z$, and by termwise integration obtain an asymptotic expansion in powers of $n^{-1}$ whose dominant term is called the saddlepoint approximation. In that way we would find

$$\frac{1}{2\pi i} \int \phi(z)\{\psi(z)\}^n\, dz \sim \frac{\phi(\hat z)\{\psi(\hat z)\}^{n+\frac12}}{\{2\pi n\, \psi''(\hat z)\}^{\frac12}}, \qquad (2\cdot9)$$

the relative error committed being $O(n^{-1})$. But in our applications $\psi(z)$ is such that either the integration can be effected exactly at this stage, or at worst only $\phi(z)$ need be expanded. The dominant term is still the saddlepoint approximation, except that $(2\pi n)^{\frac12}$ is replaced by a different constant.

In discussing the joint distribution of the statistics $r_s = c_s/c_0$ $(s = 1, 2, \ldots, m)$, we make use of an extension of (2·7) established in exactly the same way. Let

$$M(T_0, T_1, \ldots, T_m) = E\, e^{T_0 c_0 + T_1 c_1 + \ldots + T_m c_m}.$$

Write $u = T_0 + r_1 T_1 + r_2 T_2 + \ldots + r_m T_m$, and let

$$T_s = T_s(z_1, \ldots, z_m, u) \quad (s = 1, 2, \ldots, m)$$

be a suitable transformation. Then the joint probability density of $r_1, \ldots, r_m$ is

$$h(r_1, \ldots, r_m) = \frac{1}{(2\pi i)^m} \int \cdots \int \left[\frac{\partial^m}{\partial u^m}\left\{M\, \frac{\partial(T_1, \ldots, T_m)}{\partial(z_1, \ldots, z_m)}\right\}\right]_{u=0} dz_1 \ldots dz_m. \qquad (2\cdot10)$$

The integral can again be approximately evaluated by choosing contours of steepest descent in the planes of $z_1, \ldots, z_m$ passing through saddlepoints $\hat z_1, \ldots, \hat z_m$. The details of the work are most easily studied in the actual application of §§ 9-11.
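3. CIRCULAR MARKOV PROCESS. KNOWN MEAN

We first discuss the distribution of the circularly defined lag 1 serial correlation coefficient when the process is of circular Markov type with normal residuals and zero means. Let $x_1, \ldots, x_n$ be such that

$$x_s - \rho x_{s-1} = \epsilon_s, \qquad x_{n+s} = x_s \qquad (s = 1, 2, \ldots, n),$$

where $\epsilon_1, \ldots, \epsilon_n$ are independent $N(0, 1)$ variables. The joint distribution of the $x$'s is

$$dF = \frac{1 - \rho^n}{(2\pi)^{\frac12 n}} \exp\{-\tfrac12[(1 + \rho^2)(x_1^2 + \ldots + x_n^2) - 2\rho(x_1 x_2 + \ldots + x_{n-1} x_n + x_n x_1)]\}\, dx_1 \ldots dx_n. \qquad (3\cdot1)$$

The sample estimate of $\rho$ is taken to be $r = c/c_0$, where

$$c_0 = x_1^2 + \ldots + x_n^2, \qquad c = x_1 x_2 + \ldots + x_{n-1} x_n + x_n x_1.$$

The moment-generating function for $c_0$, $c$ is

$$M(T_0, T) = E\, e^{T_0 c_0 + T c} = (1 - \rho^n)\,|A|^{-\frac12},$$

where

$$A = \begin{bmatrix}
1+\rho^2-2T_0 & -(\rho+T) & 0 & \cdots & 0 & -(\rho+T)\\
-(\rho+T) & 1+\rho^2-2T_0 & -(\rho+T) & \cdots & 0 & 0\\
0 & -(\rho+T) & 1+\rho^2-2T_0 & \cdots & 0 & 0\\
\vdots & & & & & \vdots\\
0 & 0 & 0 & \cdots & 1+\rho^2-2T_0 & -(\rho+T)\\
-(\rho+T) & 0 & 0 & \cdots & -(\rho+T) & 1+\rho^2-2T_0
\end{bmatrix}.$$

Evaluating its determinant as a circulant we find

$$|A| = \left(\frac{\rho+T}{z}\right)^n (1 - z^n)^2, \qquad z + \frac{1}{z} = \frac{1 + \rho^2 - 2T_0}{\rho + T}, \qquad (3\cdot2)$$

and so, writing $u = T_0 + rT$,

$$M(u - rT, T) = \frac{(1 - \rho^n)(1 - 2rz + z^2)^{\frac12 n}}{(1 - z^n)(1 - 2\rho r + \rho^2 - 2u)^{\frac12 n}}, \qquad (3\cdot3)$$

where

$$\frac{\rho + T}{z} = \frac{1 - 2\rho r + \rho^2 - 2u}{1 - 2rz + z^2}. \qquad (3\cdot4)$$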
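The circulant evaluation behind (3·2) can be checked numerically. The sketch below is an illustration under stated assumptions rather than the author's computation: it takes the determinant in the form $|A| = \{(\rho+T)/z\}^n (1 - z^n)^2$ with $z + 1/z = (1+\rho^2-2T_0)/(\rho+T)$, and compares it with the product of the eigenvalues $d - 2b\cos(2\pi k/n)$ of a symmetric circulant with diagonal $d$ and cyclic neighbours $-b$. The function names and the numerical values of `rho`, `t0`, `t` are hypothetical.

```python
import cmath
import math


def circulant_det(n, d, b):
    """Determinant of the n x n symmetric circulant with diagonal d and -b on
    the two cyclically adjacent diagonals, as a product of its eigenvalues."""
    det = 1.0
    for k in range(n):
        det *= d - 2.0 * b * math.cos(2.0 * math.pi * k / n)
    return det


def closed_form(n, d, b):
    """(b/z)^n * (1 - z^n)^2, with z a root of z + 1/z = d/b chosen in |z| <= 1."""
    a = d / b
    z = (a - cmath.sqrt(a * a - 4.0)) / 2.0
    if abs(z) > 1.0:
        z = 1.0 / z
    return ((b / z) ** n * (1.0 - z ** n) ** 2).real


if __name__ == "__main__":
    rho, t0, t = 0.5, -0.1, 0.05           # illustrative values only
    d = 1.0 + rho * rho - 2.0 * t0         # diagonal element of A
    b = rho + t                            # minus the off-diagonal element
    print(circulant_det(9, d, b), closed_form(9, d, b))
```

The two numbers printed should agree to rounding error; the same holds when $d/b < 2$, in which case $z$ lies on the unit circle and the closed form is still real.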


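The results of §2 can now be used. Taking $T = T(z, u)$ of (2·6) to be defined by (3·4) we have

$$\frac{\partial T}{\partial z} = \frac{(1 - z^2)(1 - 2\rho r + \rho^2 - 2u)}{(1 - 2rz + z^2)^2},$$

$$M \frac{\partial T}{\partial z} = \frac{(1 - \rho^n)(1 - z^2)(1 - 2rz + z^2)^{\frac12 n - 2}}{(1 - z^n)(1 - 2\rho r + \rho^2 - 2u)^{\frac12 n - 1}},$$

and (2·7) becomes

$$h(r) = \frac{(n - 2)(1 - \rho^n)}{2\pi i\,(1 - 2\rho r + \rho^2)^{\frac12 n}} \int \frac{(1 - z^2)(1 - 2rz + z^2)^{\frac12 n - 2}}{1 - z^n}\, dz, \qquad (3\cdot5)$$

with

$$\rho + T = \frac{z(1 - 2\rho r + \rho^2)}{1 - 2rz + z^2}. \qquad (3\cdot6)$$

On examining the successive transformations

$$\zeta = \tfrac12(z + z^{-1}), \qquad \zeta - r = \frac{1 - 2\rho r + \rho^2}{2(\rho + T)},$$

which together form (3·6), it will be seen that the region $|z| < 1$ is mapped on to the whole $T$ plane cut along the parts of the real axis exterior to the interval

$$\{-(1 + \rho)^2/2(1 + r),\; (1 - \rho)^2/2(1 - r)\}.$$

Any path in the $T$ plane running from $T = -i\infty$ through the gap in the real axis to $T = +i\infty$ corresponds to a path in $|z| < 1$ running from $e^{-i\theta}$ to $e^{i\theta}$, where $r = \cos\theta$. The path of integration for (3·5) is therefore of this form.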


Consider the factor $(1 - 2rz + z^2)^{\frac12 n - 2} = [(1 - r^2) + (z - r)^2]^{\frac12 n - 2}$ in the integrand. It has a real saddlepoint at $z = r$, through which the path of steepest descent is $\arg(1 - 2rz + z^2) = 0$ crossing the real axis orthogonally. This is just the straight line joining the points $e^{-i\theta}$, $e^{i\theta}$, and we choose it to be the path of integration for (3·5).

So far (3·5) is exact, but if we are prepared to tolerate errors which are 'exponentially small', i.e. of magnitude $O(\lambda^n)$ for some $|\lambda| < 1$, the factor $1/(1 - z^n)$ may be ignored provided neither $r$ nor $\rho$ is near $\pm 1$. This assertion is discussed in the next section; assuming it to be true and putting $z = r + iw(1 - r^2)^{\frac12}$ $(-1 < w < 1)$, we find that, ignoring $\rho^n$ also,

$$h(r) \sim \frac{(n - 2)(1 - r^2)^{\frac12(n-1)}}{2\pi(1 - 2\rho r + \rho^2)^{\frac12 n}} \int_{-1}^{1} (1 + w^2)(1 - w^2)^{\frac12 n - 2}\, dw = \frac{\Gamma(\frac12 n + 1)}{\Gamma(\frac12)\Gamma(\frac12 n + \frac12)}\, \frac{(1 - r^2)^{\frac12(n-1)}}{(1 - 2\rho r + \rho^2)^{\frac12 n}}, \qquad (3\cdot7)$$

which is Leipnik's approximation. The smoothing procedure is thus equivalent to ignoring $z^n$ in $(1 - z^n)$. If this factor is expanded in powers of $z^n$ we obtain on integration the series


Series 
hor) = Tn 10 -9[0— A LE 
| a ie +p T(4u+ 4) aT (3n+4) dr” 
5o T gare.) (3-8) 


e r G 
* Da (an + 4) dr?" 


which confirms Dixon's observation that when $\rho = 0$ the moments of Leipnik's approximation about the origin agree with those of the exact distribution for all orders up to $n$, since the contributions from terms after the first reduce to zero on partial integration.
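Leipnik's approximation is easy to evaluate directly. The sketch below is an illustration only, assuming the form $h(r) = \Gamma(\frac12 n + 1)\,(1 - r^2)^{\frac12(n-1)} / \{\Gamma(\frac12)\Gamma(\frac12 n + \frac12)(1 - 2\rho r + \rho^2)^{\frac12 n}\}$ for (3·7); the function names and the Simpson step count are hypothetical. It evaluates the density at two points for $n = 20$ and integrates it over $(-1, 1)$ as a check that it behaves as a proper density.

```python
import math


def leipnik_density(r, rho, n):
    """Leipnik's approximate density of the circular lag-1 coefficient,
    in the form assumed above for (3.7); zero outside |r| < 1."""
    if abs(r) >= 1.0:
        return 0.0
    log_const = (math.lgamma(0.5 * n + 1.0) - math.lgamma(0.5)
                 - math.lgamma(0.5 * n + 0.5))
    log_shape = (0.5 * (n - 1) * math.log(1.0 - r * r)
                 - 0.5 * n * math.log(1.0 - 2.0 * rho * r + rho * rho))
    return math.exp(log_const + log_shape)


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(leipnik_density(0.0, 0.0, 20))   # ordinate at r = 0 when rho = 0
    print(leipnik_density(0.5, 0.5, 20))   # ordinate at r = 0.5 when rho = 0.5
    print(simpson(lambda r: leipnik_density(r, 0.5, 20), -1.0, 1.0))
```

Working with `math.lgamma` rather than `math.gamma` keeps the constant stable for large $n$.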


4. THE ERROR OF THE APPROXIMATION

Here we determine an upper bound to the error incurred by the approximation. The effect of removing the factor $1/(1 - z^n)$ is to alter the integral in (3·5) by an amount

$$\delta = \int \frac{(1 - z^2)(1 - 2rz + z^2)^{\frac12 n - 2}\, z^n}{1 - z^n}\, dz.$$
Since 
2 pha wi - r7), 
ores -np«20-7 [1-2]|21-]z|" 


Whe, 

Wt es 3i (-1«w«D- 

n |l 

=2] = {a apral 

a pawa cues 

[9| «4Q em $ 1- [+u -— 1— w?) dw. 

all except near w = + l, where the second fact 
or 


r+iw(l -F 


b 
ve have 
eeps it sm: 


Th, 

| e 

) first factor in the integrand k 

aye ~ dn —r®)(1—w?) only. Let i m Ts +f" ; 
o Pid where 


takes . 
0 Over since 1—Í7 


"x1. Then " 
ko pees 
ey prii- 
; 21 — r2 in 
ppe- 3B(3n — 1,3) (1 I, «(4m — 1,3) 


e 


2 pw? 
E in k 
7?) I, a- qp2)in-? dw 




in the usual notation for complete and incomplete Beta functions. In k<w< 1, 


—w?) 
ESL 


L- pef Sa 1- pterea cese 


since it is a concave function of w?, so that 


q 0-2) : ] —w2)ir-3 d. 
(Sie oars vy " 


= (1-4k?) 
XO 1— [+k -r)i 
Hence | 


2 Set es [264 — 429 1 — 
lels prea P Un 1, (2420-1 fbn 9] 
(n—3) 


(n — 4) 
For each value of $r$ there is a best value of $k$ which has to be determined numerically. This gives an upper bound for $|\delta|$ which when multiplied by $(n - 2)/2\pi(1 - 2\rho r + \rho^2)^{\frac12 n}$ gives an upper bound $\Delta(r)$ to the absolute error in Leipnik's approximation. (More strictly the factor $1 - \rho^n$ should also be included when $\rho^n$ is not negligible.)


1B(1n— 1,3) 4 «(1n — 2,3). | 


+ (1—32) heln- 2, yj . 


Table 1. Leipnik's approximate density h(r). Error Δ(r). n = 20

          ρ = 0                                     ρ = 0·5
   r      h(r)     Δ(r)    |     r      h(r)     Δ(r)    |     r      h(r)     Δ(r)
  0      1·8066   0·0050   |   -1·0      -        -      |    0·4    1·7511   0·0121
  0·1    1·6421   0·0048   |   -0·9      -        -      |    0·5    2·0860   0·0254
  0·2    1·2258   0·0042   |   -0·8      -        -      |
  0·3    0·7375   0·0034   |   -0·7      -        -      |    0·55   2·0870   0·0358
  0·4    0·3448   0·0024   |   -0·6    0·0001     -      |    0·60   1·9339   0·0490
  0·5    0·1175   0·0014   |   -0·5    0·0004     -      |    0·65   1·6221   0·0640
  0·6    0·0260   0·0007   |   -0·4    0·0023     -      |    0·70   1·1889   0·0776
  0·7    0·0030   0·0002   |   -0·3    0·0092     -      |    0·75   0·7185   0·0832
  0·8    0·0001     -      |   -0·2    0·0298   0·0001   |    0·80   0·3233   0·0725
  0·9      -        -      |   -0·1    0·0817   0·0002   |    0·85   0·0884   0·0431
  1·0      -        -      |    0      0·1940   0·0005   |    0·90   0·0092   0·0118
                           |    0·1    0·4059   0·0012   |    0·95   0·0001   0·0039
                           |    0·2    0·7526   0·0026   |    1·00     -        -
                           |    0·3    1·2317   0·0056   |

Calculations for the case $n = 20$, $\rho = 0$ and 0·5, are shown in Table 1. When $\rho = 0$ the error is seen to be quite negligible in the tails of the distribution, but when $\rho = 0{\cdot}5$ there is a possibility that the upper tail may be materially affected at, say, the 1% level, since $\Delta(r)$ is in that region comparable in magnitude with the ordinates. The error may of course be less than $\Delta(r)$ but cannot be ignored without a more thorough investigation.


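5. CIRCULAR MARKOV PROCESS. UNKNOWN MEAN

When the true mean is unknown a suitable estimate of $\rho$ is $r = C/C_0$, where

$$\begin{aligned}
C_0 &= (x_1 - \bar x)^2 + \ldots + (x_n - \bar x)^2 = c_0 - n\bar x^2,\\
C &= (x_1 - \bar x)(x_2 - \bar x) + \ldots + (x_{n-1} - \bar x)(x_n - \bar x) + (x_n - \bar x)(x_1 - \bar x) = c - n\bar x^2,
\end{aligned} \qquad (5\cdot1)$$

and $\bar x = (x_1 + x_2 + \ldots + x_n)/n$. The moment-generating function for $C_0$, $C$ is

$$M(T_0, T) = E\, e^{T_0 C_0 + T C} = (1 - \rho^n)\left|A + \frac{2}{n}(T_0 + T)\, l l'\right|^{-\frac12},$$

with $A$ as previously defined, and $l' = [1, 1, \ldots, 1]$ so that $l l'$ is the $n \times n$ matrix with unit elements. Since every row of $A$ sums to $(1 - \rho)^2 - 2(T_0 + T)$ the determinant can be written as

$$|A|\,|I + \lambda\, l l'|, \qquad \lambda = \frac{2(T_0 + T)}{n\{(1 - \rho)^2 - 2(T_0 + T)\}}.$$

The second factor is

$$|I + \lambda\, l l'| = 1 + n\lambda,$$

so that

$$\left|A + \frac{2}{n}(T_0 + T)\, l l'\right| = \left(\frac{\rho + T}{z}\right)^n \frac{(1 - z^n)^2 (1 - \rho)^2}{(1 - \rho)^2 - 2(T_0 + T)},$$

and hence

$$M(u - rT, T) = \frac{(1 - \rho^n)(1 - z)(1 - 2rz + z^2)^{\frac12(n-1)}}{(1 - z^n)(1 - \rho)(1 - 2\rho r + \rho^2 - 2u)^{\frac12(n-1)}}, \qquad (5\cdot2)$$

with $z$ defined by (3·6). Proceeding as in §2 we find

$$h(r) = \frac{(n - 3)(1 - \rho^n)}{2\pi i\,(1 - \rho)(1 - 2\rho r + \rho^2)^{\frac12(n-1)}} \int \frac{(1 - z)(1 - z^2)(1 - 2rz + z^2)^{\frac12(n-5)}}{1 - z^n}\, dz. \qquad (5\cdot3)$$

With $z = r + iw(1 - r^2)^{\frac12}$ $(-1 < w < 1)$, omitting the factor $1/(1 - z^n)$ again introduces an exponentially small error. Neglecting $\rho^n$ and evaluating the integral we find

$$h(r) \sim \frac{n\,\Gamma(\frac12 n - \frac12)}{2\Gamma(\frac12)\Gamma(\frac12 n)(1 - \rho)}\, \frac{(1 - r^2)^{\frac12 n - 1}}{(1 - 2\rho r + \rho^2)^{\frac12(n-1)}} \left\{(1 - r) - \frac{1 + r}{n}\right\}. \qquad (5\cdot4)$$

It is evident from an expansion of the type (3·8) that the moments of $r$ when $\rho = 0$ are exact up to order $n$. The fact that (5·4) can be negative when $1 - r = O(n^{-1})$ is not in itself serious, since approximations of this type in any case break down near $r = \pm 1$. A modified form of the approximation (equation (7·6)) is discussed in §7.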
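Two properties of the fitted-mean approximation can be illustrated numerically. The sketch below assumes (5·4) in the form $h(r) \sim n\Gamma(\frac12 n - \frac12)\,(1 - r^2)^{\frac12 n - 1}\{(1 - r) - (1 + r)/n\} / \{2\Gamma(\frac12)\Gamma(\frac12 n)(1 - \rho)(1 - 2\rho r + \rho^2)^{\frac12(n-1)}\}$, which is a reading of the printed formula and should be treated as an assumption; the function names are hypothetical. Under that assumption the expression integrates to one over $(-1, 1)$ and becomes negative once $r$ exceeds $(n - 1)/(n + 1)$.

```python
import math


def fitted_mean_density(r, rho, n):
    """Approximation (5.4) for the circular coefficient with fitted mean,
    in the form assumed above; it is negative for r > (n - 1)/(n + 1)."""
    if abs(r) >= 1.0:
        return 0.0
    log_const = (math.log(n) + math.lgamma(0.5 * (n - 1)) - math.log(2.0)
                 - math.lgamma(0.5) - math.lgamma(0.5 * n))
    core = ((1.0 - r * r) ** (0.5 * n - 1.0)
            / (1.0 - 2.0 * rho * r + rho * rho) ** (0.5 * (n - 1)))
    bracket = (1.0 - r) - (1.0 + r) / n
    return math.exp(log_const) / (1.0 - rho) * core * bracket


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    n, rho = 20, 0.5
    print(simpson(lambda r: fitted_mean_density(r, rho, n), -1.0, 1.0))
    print(fitted_mean_density(0.95, rho, n))   # beyond (n - 1)/(n + 1) = 19/21
```

The negative ordinates occur only within a distance of order $1/n$ of $r = 1$, where the approximation is not expected to hold.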


6. NON-CIRCULAR MARKOV PROCESS. KNOWN MEAN

In testing independence the use of a circularly defined sample serial correlation coefficient may be objectionable on grounds of loss of power and sensitivity to extraneous trends. But the circular definition of the process itself is artificial and can only be justified if the results arrived at by its use are not substantially affected by the assumption of circularity. We now examine the approximate distribution of the sample serial correlation coefficient without introducing circularity assumptions.

The process is defined by

$$x_s - \rho x_{s-1} = \epsilon_s \qquad (6\cdot1)$$

for all $s$, the $\epsilon$'s being independent $N(0, 1)$ variables as before. The joint distribution of $x_1, x_2, \ldots, x_n$ is now

$$dF = \frac{(1 - \rho^2)^{\frac12}}{(2\pi)^{\frac12 n}} \exp[-\tfrac12\{x_1^2 + (1 + \rho^2)(x_2^2 + \ldots + x_{n-1}^2) + x_n^2 - 2\rho(x_1 x_2 + x_2 x_3 + \ldots + x_{n-1} x_n)\}]\, dx_1 \ldots dx_n. \qquad (6\cdot2)$$

The sample estimate of $\rho$ is taken to be $r = c/c_0$, where

$$\begin{aligned}
c &= x_1 x_2 + x_2 x_3 + \ldots + x_{n-1} x_n,\\
c_0 &= \tfrac12 x_1^2 + x_2^2 + \ldots + x_{n-1}^2 + \tfrac12 x_n^2.
\end{aligned} \qquad (6\cdot3)$$

We have chosen the intra-class correlation coefficient between the series $x_1, x_2, \ldots, x_{n-1}$ and $x_2, x_3, \ldots, x_n$. Apart from its intuitive appeal it seems to lead to the simplest analysis both here and in the next section where the mean is fitted. We now have

$$M(T_0, T) = E\, e^{T_0 c_0 + T c} = (1 - \rho^2)^{\frac12}\, |B|^{-\frac12},$$

where

$$B = \begin{bmatrix}
1 - T_0 & -(\rho+T) & 0 & \cdots & 0 & 0\\
-(\rho+T) & 1+\rho^2-2T_0 & -(\rho+T) & \cdots & 0 & 0\\
0 & -(\rho+T) & 1+\rho^2-2T_0 & \cdots & 0 & 0\\
\vdots & & & & & \vdots\\
0 & 0 & 0 & \cdots & 1+\rho^2-2T_0 & -(\rho+T)\\
0 & 0 & 0 & \cdots & -(\rho+T) & 1 - T_0
\end{bmatrix}.$$

Then $|B| = G_n - 2(\rho^2 - T_0)\,G_{n-1} + (\rho^2 - T_0)^2\, G_{n-2}$, where $G_n$ is the determinant of a matrix similar to $B$ except that all its diagonal elements are $1 + \rho^2 - 2T_0$. Also

$$G_n = (1 + \rho^2 - 2T_0)\, G_{n-1} - (\rho + T)^2\, G_{n-2},$$

with $G_0 = 1$, $G_1 = 1 + \rho^2 - 2T_0$. In this way it is found that

$$|B| = \frac{1}{1 - z^2}\left(\frac{\rho + T}{z}\right)^n \left\{\left[1 - \frac{(\rho^2 - T_0)z}{\rho + T}\right]^2 - z^{2n}\left[z - \frac{\rho^2 - T_0}{\rho + T}\right]^2\right\}, \qquad (6\cdot4)$$

with $z$ as before. It can again be shown that if $1 - r^2$ is not small, omission of the term in $z^{2n}$ within the bracket incurs an exponentially small error in the final approximation. For brevity we omit it at this stage and write

$$|B| \sim \frac{1}{1 - z^2}\left(\frac{\rho + T}{z}\right)^n \left[1 - \frac{(\rho^2 - T_0)z}{\rho + T}\right]^2,$$

whence, after some reduction,

$$M(u - rT, T) \sim \frac{(1 - \rho^2)^{\frac12}(1 - z^2)^{\frac12}(1 - 2rz + z^2)^{\frac12 n}}{(1 - 2\rho r + \rho^2 - 2u)^{\frac12 n - 1}\,[(1 - \rho z)(1 - \rho r - (r - \rho)z) - u(1 - z^2)]}, \qquad (6\cdot5)$$

with $z$ given by (3·4). Using (2·7) we get the result


n(1—p?)t [690-2 2ybn-2 
ilm 27i(1 — 2pr +p?) cam s 


-1 




where s 
— a had 
a-pa- pr- 0AA) nli- l-e) 


A(z) = 


and $z = r + iw(1 - r^2)^{\frac12}$. The integral cannot be readily evaluated in closed form. We therefore expand $\phi(z)$ as a power series in $z - r$ and integrate. The odd terms vanish and we get the series


Ag a 
21 $() (1— 272 + 22)n-2 dz 
=s Ugel qo jay 1 E E ERO 
airgap i g-a- (+m 1) (n D) 
(-yü-n» P(r) 
tta arata . (66) 


Since $\phi(r) = \{(1 - r^2)^{\frac12}/(1 - \rho r)\}\{1 + O(n^{-1})\}$ a first approximation to $h(r)$ is


i. phy 1-72) 

hr) wt Tn-1) 0 p» 14-0(n-1)). 67 

(0) ai Tn) r ma t pt e ey 
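The determinant of the tridiagonal matrix $B$ of this section can be checked against the recurrence for $G_n$. The sketch below is an illustration under an explicit assumption: it takes the expansion in the form $|B| = G_n - 2(\rho^2 - T_0)G_{n-1} + (\rho^2 - T_0)^2 G_{n-2}$, with $G_k = (1+\rho^2-2T_0)G_{k-1} - (\rho+T)^2 G_{k-2}$, $G_0 = 1$, $G_1 = 1+\rho^2-2T_0$, and compares it with a determinant computed by plain Gaussian elimination. All function names and the numerical values are hypothetical.

```python
def matrix_b(n, rho, t0, t):
    """Tridiagonal matrix B: interior diagonal 1 + rho^2 - 2*t0,
    corner elements 1 - t0, off-diagonal elements -(rho + t)."""
    d = 1.0 + rho * rho - 2.0 * t0
    b = rho + t
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = d
        if i + 1 < n:
            m[i][i + 1] = -b
            m[i + 1][i] = -b
    m[0][0] = 1.0 - t0
    m[n - 1][n - 1] = 1.0 - t0
    return m


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for j in range(col, n):
                a[i][j] -= factor * a[col][j]
    return det


def det_b_recurrence(n, rho, t0, t):
    """|B| from the recurrence for G_k and the assumed corner expansion (n >= 2)."""
    d = 1.0 + rho * rho - 2.0 * t0
    b = rho + t
    e = rho * rho - t0                      # amount by which each corner falls short of d
    g = [1.0, d]
    for _ in range(2, n + 1):
        g.append(d * g[-1] - b * b * g[-2])
    return g[n] - 2.0 * e * g[n - 1] + e * e * g[n - 2]


if __name__ == "__main__":
    args = (7, 0.4, -0.2, 0.1)              # illustrative values only
    print(determinant(matrix_b(*args)), det_b_recurrence(*args))
```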
7. ACCURACY OF THE APPROXIMATION. RENORMALIZATION

The remainder term in (6·7) is relatively $O(n^{-1})$, though there has also been neglected an exponentially small term which in samples of moderate size (say, $n$ about 15) may well be comparable in magnitude. To study the accuracy of the approximation adequately would require the calculation of an upper bound to the error as in §4. We do not attempt this here, but instead merely assume that $n$ is large enough for the exponentially small term to be negligible and content ourselves with a heuristic discussion of orders of magnitude. While admittedly not entirely satisfactory it does give some insight into the magnitude of the errors involved, and suggests a device for improving the accuracy without much extra trouble.

Observe first that the variance of $r$ is $O(n^{-1})$ so that values of $h(r)$ outside some range on either side of $\rho$ which is $O(n^{-\frac12})$ are negligibly small. Thus $r - \rho$ can be regarded as $O(n^{-\frac12})$ over the effective range of $r$. Consider the first neglected term in the expansion, the one which is $O(n^{-1})$. If $r$ is replaced by $\rho$ the term is thereby altered by an amount $O(n^{-\frac32})$, but it is now independent of $r$ and in effect becomes part of the normalizing constant. One may therefore legitimately write

a-pa- 13:0 T : 
i E Bra agen 4: 0(n3) (7-1) 


ormalizing constant.* Moreover 


Over 
the effective range of r, where Kisan adjusted n 
2 pp 
tt Sa end 
1—3pr4p* a-P) 
a 2 
"d so "TET EN Olle 
c EE sg 
1-8 yea- olt- p? 
m ; =p]: i 
a [sl (73) 


* The device of renormalizing a saddlepoint approximation was used by Cox (1948) without consideration of the magnitude of the remainder.



Consequently, to the same order, h(r) may be written in Leipnik’s form as 


DQN+1)  (1—:21N-D P 
h(r)~ —~2 F 3 1:4) 
(7) „TOND (1 3pr x pris d + On 3 ( 
with N css: s a (1:3) 
l =p? 


Actually when $\rho = 0$ the relative magnitude of the remainder reduces to $O(n^{-2})$, since $r^2$ is now $O(n^{-1})$ and the coefficients of successive terms in the expansion of the integral become functions of $r^2$ only. The distribution when $\rho = 0$ is approximately the same as that of an intra-class correlation coefficient with known mean from a sample of $n$ pairs. (The inter-class coefficient has one fewer degree of freedom.) But when $\rho \neq 0$ the distributions are not the same to the order considered.

The same device can be used to simplify the approximation (5·4) for the circular coefficient with fitted mean. If errors of magnitude $O(n^{-\frac32})$ are tolerable the term $(1 + r)/n$ in the bracket may be ignored and the distribution renormalized to give

$$h(r) \sim \frac{(n + 1)\,\Gamma(\frac12 n + \frac12)}{\Gamma(\frac12)\Gamma(\frac12 n)\,[n(1 - \rho) + (1 + \rho)]}\, \frac{(1 - r)(1 - r^2)^{\frac12 n - 1}}{(1 - 2\rho r + \rho^2)^{\frac12(n-1)}}\, \{1 + O(n^{-\frac32})\}. \qquad (7\cdot6)$$

So to $O(n^{-\frac32})$ the distribution of $r$ with fitted mean is the same as the distribution found by Jenkins (1954) for the first partial coefficient with known mean.

8. NoN-CIRCULAR MARKOV PROCESS, UNKNOWN MEAN 


When the mean is unknown we estimate it by z = and 


estimate p by r = C/C,, where Gertat... E, G4 dos)l(n— 1) 
€ = (z,—&) (x, — 9) +... + (9,3 — E) (£, 3) = co Gay 
Co = (5, —2)? (v4 — T)? +... (c, .,—m)24. Hen- z)? 

with c, cy defined as in $6. Then 
MT, T) = (1—p)#|B Len 


)8?, 


= Cy— (n— 1) 2", 


mm’ 


where B is the matrix of $6, and m’ = [1, 1, 1, ..., 1, $]. The determinant can be evaluated 
as in §5, with some extra manipulation to allow for the fact that the sum of the first or last
row of $B$ is not exactly half the sum of any other row. We omit the somewhat lengthy details
and record the result that
—p*)t(1—z =2) (1-2 2)4(n— 
M(u—rT,T)~ (1—p?)? ( : Sac 1) 

(1—p) (1 —pa) [1 - pr — &—9)2— "Q0 —2)] (1 — 257 4 ga — uk 

(+p)? (1 = 2r2 +2) [@—p) (1—pa) (1 
(n—1) (1 —2) (1— 2pr t p? = 2u) [1 — pr — 


-$ 


—1)+u(1—2)] 
("—p)z—u(1—2*)] 
where a term in $z^{n-1}$ has been ignored for the reasons stated, and $z$ is given by (3·6).

If we are content with an approximation having remainder relatively $O(n^{-1})$ the last
factor may be ignored, the dominant term being taken as before in the expansion of the
integral for $h(r)$ and subsequently renormalized. Ultimately it is found that

X( o9 pes M 

Mrs : E) (apr E pie. U+ Om), 


«fis 


| 
! 
f H. E. DANIELS 179 


where $K$ is a normalizing constant. Following the method of §7 we can replace it to the
same order of accuracy by


h(r)~  FTQNAD. q-2ü-7m9^ A 
ST) ENG - )- (1+) Scar noue Ea » 


wi 
b puni Ve pc eh. 
The effect of employing non-circular definitions is therefore to replace $n$ by $N$ in the
approximate distribution, to $O(n^{-1})$, whether the mean is known or fitted. But in the latter
case when $\rho = 0$ the approximate distribution of $r$ is not the same as that of the intra-class
correlation coefficient with fitted mean.


9. CIRCULAR AUTOREGRESSIVE PROCESS OF ORDER m. KNOWN MEAN


The discussion is now extended to cover the mth order autoregressive process. For simplicity
only a circularly defined process and statistics are considered. The non-circular case could
be dealt with by the methods already used, but the labour involved would be considerable.

The process is defined to be
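$$x_t + \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \dots + \alpha_m x_{t-m} = \epsilon_t \quad (t = 1, 2, \dots, n), \qquad x_s = x_{n+s}, \qquad (9{\cdot}1)$$

where $m$ is $O(1)$ with respect to $n$ and the $\epsilon$'s are independent $N(0, 1)$ variables. It may be
concisely written as
$$Ax = \epsilon,$$
where $A$ is a circulant with first row $(1, 0, 0, \dots, 0, \alpha_m, \alpha_{m-1}, \dots, \alpha_2, \alpha_1)$. Its determinant has
the value
$$|A| = \prod_{j=1}^{n} (1 + \alpha_1\omega_j + \dots + \alpha_m\omega_j^m) = \prod_{t=1}^{m} (1 - \theta_t^n), \qquad (9{\cdot}2)$$
where $\omega_j = e^{2\pi i j/n}$ and $\theta_1, \theta_2, \dots, \theta_m$ are the roots of $\theta^m + \alpha_1\theta^{m-1} + \dots + \alpha_m = 0$. The
joint distribution of the $x$'s is then
$$dF = \frac{|A|}{(2\pi)^{n/2}}\, e^{-\frac12 x'A'Ax} \prod_{j=1}^{n} dx_j. \qquad (9{\cdot}3)$$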
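The determinant formula (9·2) can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `circulant_ar_matrix` and `det_from_roots` and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

def circulant_ar_matrix(alpha, n):
    """n x n circulant A with (A x)_t = x_t + sum_k alpha_k x_{t-k}, indices mod n."""
    first_row = np.zeros(n)
    first_row[0] = 1.0
    for k, a in enumerate(alpha, start=1):
        first_row[n - k] = a  # alpha_k sits k places from the end of the first row
    return np.array([np.roll(first_row, t) for t in range(n)])

def det_from_roots(alpha, n):
    """Product of (1 - theta_t^n) over the roots of theta^m + alpha_1 theta^(m-1) + ... + alpha_m."""
    theta = np.roots([1.0, *alpha])
    return np.prod(1.0 - theta ** n).real

alpha = [0.5, -0.3]  # illustrative coefficients, not taken from the paper
n = 8
direct = np.linalg.det(circulant_ar_matrix(alpha, n))
via_roots = det_from_roots(alpha, n)
print(direct, via_roots)
```

The two printed values should agree to rounding error for any $n > m$; the check does not depend on the roots lying inside the unit circle.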


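Note that $A'A$ is a circulant with first row
$$(1 + \alpha_1^2 + \dots + \alpha_m^2,\ \ \alpha_1 + \alpha_1\alpha_2 + \dots + \alpha_{m-1}\alpha_m,\ \ \alpha_2 + \alpha_1\alpha_3 + \dots + \alpha_{m-2}\alpha_m,\ \ \dots,\ \ \alpha_m,\ 0,\ \dots,\ 0,\ \alpha_m,\ \dots,\ \alpha_1 + \alpha_1\alpha_2 + \dots + \alpha_{m-1}\alpha_m).$$

We consider the joint distribution of the $m$ circularly defined coefficients $r_s = c_s/c_0$
$(s = 1, 2, \dots, m)$, where
$$c_0 = x_1^2 + x_2^2 + \dots + x_n^2, \qquad c_s = x_1 x_{1+s} + x_2 x_{2+s} + \dots + x_n x_{n+s}.$$

The moment-generating function for the $c$'s is
$$M(T_0, T_1, \dots, T_m) = \prod_{t=1}^{m} (1 - \theta_t^n)\, |A'A - \Theta|^{-\frac12}, \qquad (9{\cdot}4)$$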


$\Theta$ being a circulant with first row $(2T_0, T_1, T_2, \dots, T_m, 0, \dots, 0, T_m, \dots, T_2, T_1)$. When $m = 1$,
$A'A - \Theta$ is just $A$ of §3 with $\alpha_1 = -\rho$.


Let us introduce new variables $P, a_1, a_2, \dots, a_m$ related to the elements of $A'A - \Theta$ in the
following way:


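$$\begin{aligned}
1 + \alpha_1^2 + \dots + \alpha_m^2 - 2T_0 &= P(1 + a_1^2 + \dots + a_m^2),\\
\alpha_1 + \alpha_1\alpha_2 + \dots + \alpha_{m-1}\alpha_m - T_1 &= P(a_1 + a_1a_2 + \dots + a_{m-1}a_m),\\
\alpha_2 + \alpha_1\alpha_3 + \dots + \alpha_{m-2}\alpha_m - T_2 &= P(a_2 + a_1a_3 + \dots + a_{m-2}a_m),\\
&\ \ \vdots\\
\alpha_{m-1} + \alpha_1\alpha_m - T_{m-1} &= P(a_{m-1} + a_1a_m),\\
\alpha_m - T_m &= Pa_m.
\end{aligned} \qquad (9{\cdot}5)$$

Then
$$|A'A - \Theta| = P^n \prod_{t=1}^{m} (1 - \phi_t^n)^2,$$
where $\phi_1, \dots, \phi_m$ are the roots of $\phi^m + a_1\phi^{m-1} + \dots + a_m = 0$. In the special case $m = 1$ this
reduces to $P^n[1 - (-a_1)^n]^2$ with
$$P = \frac{\alpha_1 - T_1}{a_1}, \qquad a_1 + \frac{1}{a_1} = \frac{1 + \alpha_1^2 - 2T_0}{\alpha_1 - T_1},$$
which is identical with (3·2) if $\rho = -\alpha_1$, $z = -a_1$, $T_1 = T$. The present method is thus
a natural extension of the one used for the Markov process.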
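The special case $m = 1$ can be verified directly. The sketch below, assuming NumPy, solves the two equations of (9·5) for $P$ and $a_1$ and compares the closed form with a determinant computed from the matrices themselves. The function names and the numerical values of $\alpha_1$, $T_0$, $T_1$ are illustrative assumptions, and the root with $|a_1| < 1$ is taken, which requires $|a_1 + 1/a_1| > 2$.

```python
import numpy as np

def shift_matrix(n, s):
    """Circular shift: (S x)_t = x_{t-s}, indices taken mod n."""
    return np.roll(np.eye(n), s, axis=0)

def solve_P_and_a(alpha1, T0, T1):
    """Solve (9.5) for m = 1, taking the root with |a_1| < 1 (needs |k| > 2)."""
    k = (1.0 + alpha1 ** 2 - 2.0 * T0) / (alpha1 - T1)  # equals a_1 + 1/a_1
    a1 = (k - np.sign(k) * np.sqrt(k * k - 4.0)) / 2.0
    P = (alpha1 - T1) / a1
    return P, a1

alpha1, T0, T1, n = 0.4, -0.2, 0.1, 10  # illustrative values only
S = shift_matrix(n, 1)
A = np.eye(n) + alpha1 * S                           # (A x)_t = x_t + alpha_1 x_{t-1}
Theta = 2.0 * T0 * np.eye(n) + T1 * (S + S.T)        # circulant with first row (2T_0, T_1, 0, ..., 0, T_1)
P, a1 = solve_P_and_a(alpha1, T0, T1)
direct = np.linalg.det(A.T @ A - Theta)
closed_form = P ** n * (1.0 - (-a1) ** n) ** 2
print(P, a1, direct, closed_form)
```

With these inputs the quadratic gives $a_1 = 0.2$ and $P = 1.5$, and the two determinant values should coincide up to rounding.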
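Let $u = T_0 + r_1T_1 + r_2T_2 + \dots + r_mT_m$ and write
$$\begin{aligned}
Q(a, r) &= (1 + a_1^2 + \dots + a_m^2) + 2r_1(a_1 + a_1a_2 + \dots + a_{m-1}a_m) + \dots + 2r_ma_m\\
&= [1, a']\, R_m \begin{bmatrix} 1 \\ a \end{bmatrix} = 1 + 2a'r + a'R_{m-1}a,
\end{aligned}$$
where $a' = [a_1, a_2, \dots, a_m]$, $r' = [r_1, r_2, \dots, r_m]$
and
$$R_m = \begin{bmatrix} 1 & r_1 & r_2 & \cdots & r_m\\ r_1 & 1 & r_1 & \cdots & r_{m-1}\\ r_2 & r_1 & 1 & \cdots & r_{m-2}\\ \vdots & & & & \vdots\\ r_m & r_{m-1} & r_{m-2} & \cdots & 1 \end{bmatrix} = \begin{bmatrix} 1 & r'\\ r & R_{m-1} \end{bmatrix}.$$
From (9·5) we find
$$P = \frac{Q(\alpha, r) - 2u}{Q(a, r)}$$
and (9·4) becomes
$$M(u - r_1T_1 - \dots - r_mT_m, T_1, \dots, T_m) = \prod_{t=1}^{m} \frac{1 - \theta_t^n}{1 - \phi_t^n} \left[\frac{Q(a, r)}{Q(\alpha, r) - 2u}\right]^{\frac12 n}. \qquad (9{\cdot}6)$$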
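The three expressions for $Q(a, r)$, namely the sum over lags, the quadratic form in $R_m$, and the partitioned form $1 + 2a'r + a'R_{m-1}a$, can be compared numerically. This is a sketch assuming NumPy; the helper names and the vectors `a` and `r` are hypothetical test values.

```python
import numpy as np

def toeplitz_corr(first_row):
    """Symmetric Toeplitz matrix with the given first row (1, r_1, r_2, ...)."""
    first_row = np.asarray(first_row, dtype=float)
    k = np.arange(len(first_row))
    return first_row[np.abs(k[:, None] - k[None, :])]

def Q_lag_form(a, r):
    """Q as s_0 + 2 * sum_k r_k s_k, where s_k = sum_j b_j b_{j+k} and b = (1, a_1, ..., a_m)."""
    b = np.concatenate(([1.0], a))
    m = len(a)
    s = [np.dot(b[: m + 1 - k], b[k:]) for k in range(m + 1)]
    return s[0] + 2.0 * sum(r[k - 1] * s[k] for k in range(1, m + 1))

def Q_matrix_form(a, r):
    """Q as [1, a'] R_m [1, a']'."""
    b = np.concatenate(([1.0], a))
    return b @ toeplitz_corr(np.concatenate(([1.0], r))) @ b

def Q_partitioned_form(a, r):
    """Q as 1 + 2 a'r + a' R_{m-1} a."""
    R_m1 = toeplitz_corr(np.concatenate(([1.0], r[:-1])))
    return 1.0 + 2.0 * a @ r + a @ R_m1 @ a

a = np.array([0.3, -0.2, 0.1])   # illustrative values
r = np.array([0.5, 0.2, -0.1])
print(Q_lag_form(a, r), Q_matrix_form(a, r), Q_partitioned_form(a, r))
```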


The inversion formula (2-10) can now be used with a; in the role of zj. The successiv? 
transformations 


MN 
TP ney Dig P Os Ps 2am W Ar Aasa., A 


have Jacobians
| àd * ait aa d) ay aa sas Qm- Qm 
(m.m Ay Oy Ay +--+ + Gam l4dg djdüa ++ Om-2t Um Qm- 
ip E T! , 
a ence E M 
| 
Qy-1 + [2177 Gn 0 wee 1 ay | 
Qn 0 0 0 1 | 
= 4( — yia PmJ(a) 
where * 
i ay Ay T An-1 An 
(4 ]445 % Tag cem Qg-2 t Cmn aul 
J(a) = ay as 1+ Am-3 Am-2 
a. 
An—-1 Am 0 1 1 
0 1 
Gn 0 0 
ü 
nd OP, Gas dn) = (5) z gei 
ow, Oy s am) ou a3, «5 0m ( : 
H —9u]"* 
" (I, T s To) = (— y IQ T sa), (9-8) 
Q(u, d4; -+> m Q7 (a,r) 
is 


and 
from (2-10) the joint probability density for fx «Im 
m 
(1-9) jn-m-1(a,r) J(a)d dors el 
Il lQ (a, r) (a) dar m, (9-9) 


1 T») ci : 
H a-g") 


Aj mes 
i (2mi)" Tün-m) Qin(a,t) 
The paths of integration in the planes of $a_1, a_2, \dots, a_m$ are chosen to be the lines of steepest
descent of $Q(a, r)$ passing through the saddlepoints $\hat a_1, \hat a_2, \dots, \hat a_m$, which satisfy
$$\tfrac12\, \partial Q/\partial a_j = 0 \quad (j = 1, 2, \dots, m),$$
i.e.
$$r + R_{m-1}\hat a = 0. \qquad (9{\cdot}10)$$
The saddlepoints are thus the usual least-squares estimators of the $\alpha_j$'s. Also
$$Q(\hat a, r) = 1 - r'R_{m-1}^{-1}r = |R_m| / |R_{m-1}|.$$
Since $Q(\hat a, r)$ is real, $Q(a, r)$ must remain real on the lines of steepest descent, i.e. if $a = \xi + i\eta$
they must be such that $\Im Q(a, r) = 2\eta'(r + R_{m-1}\xi) = 0$.
The straight lines $\xi = \hat a$ satisfy this condition. On them, $a = \hat a + i\eta$ and
$$Q(a, r) = \frac{|R_m|}{|R_{m-1}|} - \eta'R_{m-1}\eta, \qquad (9{\cdot}11)$$
which is a decreasing function of each $|\eta_j|$. They are therefore the required paths of
integration.
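The saddlepoint relations can be illustrated numerically: solving $r + R_{m-1}\hat a = 0$, checking that $Q(\hat a, r)$ equals the ratio of determinants, and checking that $Q$ stays real on the line $a = \hat a + i\eta$. The sketch assumes NumPy; the simulated series, the order $m = 3$ and the displacement `eta` are arbitrary illustrative choices.

```python
import numpy as np

def toeplitz_corr(first_row):
    """Symmetric Toeplitz matrix with the given first row (1, r_1, r_2, ...)."""
    first_row = np.asarray(first_row, dtype=float)
    k = np.arange(len(first_row))
    return first_row[np.abs(k[:, None] - k[None, :])]

rng = np.random.default_rng(0)
x = rng.standard_normal(50)          # simulated series, known (zero) mean
m = 3
# circular serial correlations r_s = c_s / c_0
r = np.array([x @ np.roll(x, -s) for s in range(1, m + 1)]) / (x @ x)

R_m = toeplitz_corr(np.concatenate(([1.0], r)))
R_m1 = R_m[:m, :m]                   # leading block, i.e. R_{m-1}
a_hat = -np.linalg.solve(R_m1, r)    # saddlepoint: r + R_{m-1} a_hat = 0

def Q(a):
    # bilinear (not Hermitian) form, so it is valid for complex a as well
    return 1.0 + 2.0 * a @ r + a @ R_m1 @ a

Q_hat = Q(a_hat)
det_ratio = np.linalg.det(R_m) / np.linalg.det(R_m1)
eta = 0.05 * np.array([1.0, -2.0, 0.5])
Q_on_path = Q(a_hat + 1j * eta)
print(Q_hat, det_ratio, Q_on_path)
```

The imaginary part of `Q_on_path` should vanish to rounding error, and its real part should equal `Q_hat - eta @ R_m1 @ eta`, as in (9·11).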




Since $Q(a, r) = 0$ at the end-points of the paths (corresponding to $\Re(T_j) = \pm\infty$ in the
$T_j$ planes), the domain of integration is $\eta'R_{m-1}\eta \le |R_m| / |R_{m-1}|$.

We shall again ignore the factors $1 - \theta_t^n$, $1 - \phi_t^n$ since it is not difficult to establish that
except in critical cases $|\theta_t| < 1$, $|\phi_t| < 1$ for all $t$. (This is equivalent as before to a smoothing
operation.) The factor $J(a)$ is a polynomial of degree $m + 1$ in the $\eta_j$'s, and if retained would
lead to an approximation to $h(r)$ with an exponentially small remainder. But for simplicity
we shall replace it by the constant term $J(\hat a)$. The resulting approximation to $h(r)$, when
renormalized, will be in error by a factor $1 + O(n^{-1})$. We then find


ns 8 [ Rd s je, a (0 
Hes (27)" Tn — m) ory "| Ten R, 45 dy, diy: | 


The integral is readily transformed to the Dirichlet form and has the value 


gim In—m) | R,, |Eo 7-1 


T'(3n — ym) | R,, [$em 


The factor J(â) can be reduced to some extent as follows. Write it in partitioned form as 


Ar 


a 
D 
where D is am x m matrix having elements d; = @,_;+@,, aj f we define â, = 1 and s; ^ 

whenever j > m. Also (9:10), which is 


1 


A 


J(à)- 


, 


m 
^ 
Tt 2A 1 5-1 E, = 0, 


m 


can be rearranged as â+ E dyr, = 0 

k=l (9.19) 
or 4+D’r = 0. i 
H ^ IA | Ra | 

ence 87 (oe 9) D | ren ek 
m—1L 

and the approximation to h(r) is 

TE 9-14) 

Ar)» K Rm | — LPL og pom. ( 


KR... dn—m42 ( An, a, r) 
m il Q ( 


^ k hi 
Since $d_{mj} = 0$ $(j \neq m)$, $d_{mm} = 1$, $|D|$ reduces to an $(m-1)$th order determinant whose
elements are functions of the $\hat a_j$'s. As will appear in the next section there is no need to express
it in terms of the $r_j$'s.


10. THE JOINT DISTRIBUTION OF THE LEADING PARTIAL SERIAL
CORRELATION COEFFICIENTS


The distribution simplifies considerably when expressed in terms of partial correlations. We


use the convenient notation rj. to denote the partial pr ELE PAP 
Zs; conditional on fixed z, ,,, ..., X441 NDA UN Ro aren ener 
correlation coefficient. It is a standard result that 

$$|R_m| = (1 - r_{1.}^2)^m (1 - r_{2.}^2)^{m-1} \cdots (1 - r_{m.}^2) \qquad (10{\cdot}1)$$

and also that $\hat a_m = -r_{m.}$.


The first step is to make a transformation from the $r_j$'s to the $\hat a_j$'s. From (9·10) and (9·13)
we have $0 = r + R_{m-1}\hat a = \hat a + D'r = q$ say, so that

2q , eg or xs 
Eam R,44D 28 


and the requ; 
required Jacobian is (10-2) 


Los dados 
Next we transform from the $\hat a_j$'s to the $r_{j.}$'s. Let us introduce the
notation $\hat a_{j,m}$ to exhibit explicitly the fact that a process of order $m$ is being considered.
In this notation $\hat a_{m,m} = -r_{m.}$ and (9·10) is
$$r_j + \sum_{i=1}^{m} \hat a_{i,m}\, r_{|i-j|} = 0.$$
Subtracting the corresponding equation for order $m - 1$ we obtain
m-1 PS m 
a7 aln Ao * Tg -j mm — 0. 
i-1 
Co 
Mpa; B A 
rison wi m-1 2 x 
aa x 7 i-i m-i m-1 +1 n-j 7 0 
j a 
By i 
es ^ ^ ^ à. (103 
jm 7 Qi,m-1 T gi m-l m,m* ) 
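Both the recursion (10·3) and the determinant result (10·1) can be checked by solving the saddlepoint equations directly at successive orders. The sketch below assumes NumPy and takes $r_{j.} = -\hat a_{j,j}$, consistent with the sign convention of (9·10); the simulated data and the order $m = 4$ are illustrative assumptions, as are the helper names.

```python
import numpy as np

def toeplitz_corr(first_row):
    """Symmetric Toeplitz matrix with the given first row (1, r_1, r_2, ...)."""
    first_row = np.asarray(first_row, dtype=float)
    k = np.arange(len(first_row))
    return first_row[np.abs(k[:, None] - k[None, :])]

def a_hat(r, order):
    """Solve r_j + sum_i a_i r_|i-j| = 0 (j = 1..order) for the order-`order` saddlepoint."""
    R = toeplitz_corr(np.concatenate(([1.0], r[: order - 1])))
    return -np.linalg.solve(R, r[:order])

rng = np.random.default_rng(0)
x = rng.standard_normal(80)
m = 4
r = np.array([x @ np.roll(x, -s) for s in range(1, m + 1)]) / (x @ x)

a_m, a_m1 = a_hat(r, m), a_hat(r, m - 1)
# (10.3): a_{j,m} = a_{j,m-1} + a_{m-j,m-1} * a_{m,m} for j = 1..m-1
recursion_rhs = a_m1 + a_m1[::-1] * a_m[-1]

# partial serial correlations r_{j.} = -a_{j,j}, and the product in (10.1)
partial = np.array([-a_hat(r, j)[-1] for j in range(1, m + 1)])
det_R_m = np.linalg.det(toeplitz_corr(np.concatenate(([1.0], r))))
product = np.prod((1.0 - partial ** 2) ** np.arange(m, 0, -1))
print(a_m[:-1], recursion_rhs, det_R_m, product)
```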


Keeping $\hat a_{m,m}$ unaltered, we can use (10·3) to transform the remaining variables
$\hat a_{1,m}, \hat a_{2,m}, \dots, \hat a_{m-1,m}$ to $\hat a_{1,m-1}, \hat a_{2,m-1}, \dots, \hat a_{m-1,m-1}$.
Repetition of the procedure ultimately reduces the variables to $\hat a_{1,1}, \hat a_{2,2}, \dots, \hat a_{m,m}$ as required.

The Jacobian of (10·3) is the $(m-1)$th order determinant
ay 0 0 m ji 
l 0 0 @n,m 
0 0 1 (p 0 0 x 
PARERE D oso DOO NAR 
1 
0 @m,m Opes v | 
0 0 0 L2 


â, 
m, m 
n. Its value is 


Whi 
? , is eve 
h has a central element 148, Whe? mi 


m-1) m odd. 
= Ao r)i T 2 } 
AS a7 yao meven, (10-4) 


reduction to the required variables is Hm H,, 4... HH: 
y obtain from (9-14) the toan 


te 
“a g (101) we finall 


and 
Comin Jacobian for the ultimate 
;"bining this with (10:2) # aoas 


loi 
oY probability density of the 7j: 9 
po [pr 0 — rj.) (1 =n) TR O(n-3y,, (10:5) 


Lcd a II a 5$) jeven 
Qr (a, r) j oad 
Tt is apparently not possible to decompose Q(a,r) into 


bei t 
ng a ;;ing constan": ] ; 
ston ngon the guccessive partial correlations. 




The most important application of the distribution (provided the circularity assumption 
can be tolerated) is in testing the hypothesis that 4m = 0 by means of the statistic r,,.. es 
Zm = 0, Q(«, r) does not contain r,,, so if m is odd f». has density proportional to (1 — 5 s 
to O(n-3), the same as the null distribution of 7. while if m is even its density is proportional 
to (1— rm.) (1—72,.)5"3 to O(n-4), agreeing with that of r, found by Jenkins (1954). "M 

It will be remembered that if $J(a)$ had not been replaced by $J(\hat a)$ in (9·9) the remainder
would have been exponentially small. When $m = 2$, $J(a)$ is a cubic in $a - \hat a$ but the odd terms
vanish on integration and there is just one extra term. Calculation shows that the term can
be absorbed into the normalizing constant so that the approximation has again an
exponentially small remainder. But when $m > 3$ this is no longer the case.


11. CIRCULAR AUTOREGRESSIVE PROCESS OF ORDER m. UNKNOWN MEAN


Finally, the effect of fitting the mean is considered in the manner of §5. We require the joint
distribution of $r_s = C_s/C_0$ $(s = 1, 2, \dots, m)$, where $C_s = c_s - n\bar x^2$. The moment-generating
function is
n 2 d 114) 
MUST, +s Ty) = IL - 67) A'A- 0 (T 4. T). ( 
t=1 
Each row sum of A'A — O is 


(La Lbs — ATAT.. T) d P(Y a, 4... + an)? 


from (9-5), and as before the determinant reduces to 


(l+o,+...+0,,)? 


; 11:2) 
P(L£, 4... £a, j| ^ A- 9. i 


so that 


mim 


MUD, ..— n, T, T, D) 


tarea, —Qieu«ag) m (0-00) (13) 
Atat... +m) [Qla r) — 2u]f-9 j= (1— 07) 

The effect on $h(r)$ of having fitted the mean is thus to reduce $n$ to $n - 1$ in (9·12), and to
introduce the extra factor $1 + a_1 + \dots + a_m$ into the integrand, with a suitable readjustment
of the normalizing constant. For an approximation with error $O(n^{-1})$, the factor can again
be replaced by $1 + \hat a_1 + \dots + \hat a_m$ and taken outside the integral.


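We have therefore only to evaluate $1 + \hat a_1 + \dots + \hat a_m$ in terms of the $r_{j.}$'s (now with fitted
means). From (10·3) we have
$$\begin{aligned}
1 + \hat a_{1,m} + \dots + \hat a_{m,m} &= (1 + \hat a_{1,m-1} + \dots + \hat a_{m-1,m-1})(1 + \hat a_{m,m})\\
&= \dots = (1 + \hat a_{m,m})(1 + \hat a_{m-1,m-1}) \cdots (1 + \hat a_{1,1}), \qquad (11{\cdot}4)
\end{aligned}$$
or, in the present notation,
$$1 + \hat a_1 + \dots + \hat a_m = (1 - r_{1.})(1 - r_{2.}) \cdots (1 - r_{m.}).$$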
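The factorization of $1 + \hat a_1 + \dots + \hat a_m$ into a product over the partial serial correlations can be confirmed numerically. The following sketch assumes NumPy, uses mean-corrected circular serial correlations from a simulated series, and again takes $r_{j.} = -\hat a_{j,j}$; all names and data are illustrative.

```python
import numpy as np

def toeplitz_corr(first_row):
    """Symmetric Toeplitz matrix with the given first row (1, r_1, r_2, ...)."""
    first_row = np.asarray(first_row, dtype=float)
    k = np.arange(len(first_row))
    return first_row[np.abs(k[:, None] - k[None, :])]

def a_hat(r, order):
    """Saddlepoint of the given order: solves r + R a = 0 for the leading `order` lags."""
    R = toeplitz_corr(np.concatenate(([1.0], r[: order - 1])))
    return -np.linalg.solve(R, r[:order])

rng = np.random.default_rng(1)
x = rng.standard_normal(60)
d = x - x.mean()                      # fitted mean removed
m = 4
r = np.array([d @ np.roll(d, -s) for s in range(1, m + 1)]) / (d @ d)

lhs = 1.0 + a_hat(r, m).sum()
partial = np.array([-a_hat(r, j)[-1] for j in range(1, m + 1)])
rhs = np.prod(1.0 - partial)
print(lhs, rhs)
```

The identity is purely algebraic, so the agreement does not depend on the data being mean-corrected; the correction is applied here only to match the setting of this section.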
Hence the approximate density for the r;.8 with fitted means is, from (10-5), 
K E ar, 
Gio sj, I 7730 7 je- qp as jare Oh 


j even 


where K is anew normalizing constant. 




Thus i ing = 0 w ean i: known, 7, n b ki X; 7 
n tes i 
ting an 0 when the mea s un "n. can be taken as 
m- appro imately 


stributed with density proportional to 
(1—75,)(1—725,)390-9 when m is odd, 


and 
Jenkins (1954) gives Psi d " un M Pe 
moments of fs. with fitted me wis releases Eas UE 
Their values E Ar jes quein, Ne en 1) eei RIS 
tion is actually o ead pee i an anoni O(n). Since the error factor in our approxima- 
moments are raig dn +gP+..., it 5 -— to show that its first and second 
y in error by amounts O(n) and O(n-*). We find 
Which i E(r.)» —2](n ne —2/n+O(n~) 
sith Jenkins’s value to the required order, but 

Which E. Bes )~(n+5)/(n+1) (1 2) 4 1n 4 2/m? O(n) 

with Jenkins’s result to O(n-?) but not with Dixon’s. There is a similar dis- 


8; 
Steement with Dixon’s E). 


2 


I am indebted to Miss P. A. Johnson for the computation of Table 1.


REFERENCES 
the serial correlation coefficient. Ann. Math. Statist. 13, 1-13. 
f testing serial correlation. Skand. AktuarTidskr. 31, 88-116. 
ika, 40, 306-17. 


fidence intervals. II. Biometr 
mation. Discussion on the papers. J. R. 


Ann 
ERSO; 
NDERSON” T L. (1942). Distribution of 
ARTLETT 'WL W. (1948). On the theory o 
ARTEEI, E 8. (1953). Approximate con 
: M. S. (1954). Symposium on interval esti: 
ika, 35, 310-15. 


Cox, hi 16, 208-9. 
DANIEL Ap e A note on the asymptotic distribution of range. Biometri 
x > H. E. (1954). Saddlepoint approximations in statistics. Ann. Math. Statist. 25, 631-50. 
to the problem of serial correlation. Ann. Math. Statist. 


XON, W 
15, 11« J. (1944). Further contributions 

Crary, 10-94. 
D teli, (1944). Extension of a theorem by 
Erpgz e ont of two variables. J- R. Statist. Soc. 17, 56-1. 
J iive H. & Jurrreys, B. S. (1956). Methods of Mathema 
p» TSlby Press 
INS, SS. 

San M. (1954). Tests of hypotheses in the 


Statist. 


Harald Cramér on the frequency distribution of 


tical Physics, 3rd ed. Cambridge 


linear autoregressive model. I. Biometrika, 41, 


II. Biometrika, 43, 


dy 
NS, » " " H 
o, 189-09; M. (1956). Tests of hypotheses 1n the linear autoregressive model. 
PMANS i ; F : . 
i rati vi . . Math. Statist. 
Las gn (1942). Serial correlation and quadratic forms in normal variables. Ann. Math. Statist 
orrelation coefficient in & circularly correlated 


NIK 3 
Univer B. (1947). Distribution of the serial € 
Ow, Em apr Math. Statist. 18, 86-7. - "m 
Statics 41", (1945). Not the distribution © the seri 
Sot 16, fem Note on | 
Q d A e M. H. (1947). A large sample test for the goodness of fit of autoregresst 
Unnoyy latist. Soc. B, 11, 68-84. 


relation coefficient. Ann. Math. 


May, 
ve schemes. 


n of serial correlation coefficients. Ann. Math. Statist. 


ULLE hys SLE ET 
Rupee M H. (1949). The joint distributio 
» H. (194 3 „rjal correlati fficient. Ann. Math. ist. 
Wallis (1945). On the distribution of the serial correlation coefficien nn. Math. Statist. 16, 
Si ý " a B " 
EN orrelation in regression analysis. Ph.D. thesis, University of North 


S CCS i 
Ww Citai: - (1951). Serial c 
Arsoy ina (unpublished). 
"d p) S. & Dursry, J. (1951). 
ath. Statist. 22, 446-51. 


Exact tests of serial correlation usmg non-circular statistics. 




TESTS OF HYPOTHESES IN THE LINEAR 
AUTOREGRESSIVE MODEL 


II. NULL DISTRIBUTIONS FOR HIGHER ORDER SCHEMES:
NON-NULL DISTRIBUTIONS 


By G. M. JENKINS 
University College London* 


1. INTRODUCTION


It is known that the likelihood ratio criterion $v_k$ for testing an autoregressive scheme of
order $k - 1$ (written as A.R. $(k-1)$) against the alternative hypothesis that it is an A.R. $(k)$
is given by the partial serial correlation between $x_t$ and $x_{t-k}$ when the effects of the intermediate
variables have been eliminated. In a previous publication (Jenkins, 1954, hereafter
referred to as I), it was shown that the smoothed form of the distribution of $v_2$ is given by
Tn+4) 


203) = TAYE Gn 0 - fe (0 — v, 
Ta—r? 


and $r_k$ is the circular serial correlation of lag $k$ uncorrected for the mean.

The distribution was first derived for a random scheme and then shown to have the same
form in an A.R. (1) so that it may be used to test an A.R. (1) against an A.R. (2).
In the present paper, it is proposed to extend this to the case where the serial correlations
are corrected for the mean and, also, to construct the relevant distributions for the
higher order schemes. It has been found that up to order 4, the distributions when there is
no mean correction are alternately given by


Inl) | ( 
rA Pn 4-3) even. 
and (1-1) according as to whether the order of the alternative hypothesis is odd "lagen 
This result has been proved in the general case by Daniels (1956) using the more 
OR eno ae for the mean, it will be shown A Ex. i 
tributions of the lower order partial serial correlations are alternately given by (1-1) ( 1:3) 

PO = (BQ, 3n — 1) + BG, 1n— D)? 0-A ae, jenas 
depending on whether the order of the alternative re Dey ce ne ine This a (1: 
also been proved in the general case by Daniels Caron ap pee s by (1#) 
and (1-3) have been designated by type ei WE od E [M of circula" 
Ms vod hee MXN TERN m derived, and detailed pore 
of these distributions given for the Markoff scheme. The Yule scheme will be discus 
at a later stage. 


e of 


(1-1) 


where v, = 


ations 


ie ae2)kn—)) 


P(x) = 


* Now at the Royal Aircraft Establishment, Farnborough, 


2. THE RELATION BETWEEN DISCRIMINATION AND ESTIMATION


The procea 
ure envis i i i 
general procedure : e in testing specific hypotheses in an A-R scheme is part of a m 
—1 e .8 or 
discrimination for stationary time-series. This involves a decision = 
sion as to 


the Natur 
voci model strueture between the three families of alternative hypoth 
| i by the autoregressive (a.R.), moving average m.a.) and ned A v 
Procedure a in the manner which is now being advocated by Rudra (1954 1955) Nee 
etween A en a modification of Whittaker's periodogram, enables a decision to ies 
p PN E ian or A.R. schemes. Tf the latter is adopted, it is suggested that m 
Y Rudra Yi. _A. schemes may be run simultaneously using à method previously given 
We shall denote the procedure for picking out the ‘best’ family of hypotheses by Type-discrimination
(or briefly, T-discrimination), and that for selecting the optimum order of
hypothesis within a particular model type by Order-discrimination (O-discrimination). In
this paper, we shall be concerned with a method of O-discrimination in the A.R. scheme
applicable to fairly short series.

It is important to point out that discrimination and estimation are complementary
aspects of the inference problem for stationary time-series. It is a convenient property of
the A.R. scheme that these two procedures may be carried out independently of one another,
discrimination preceding estimation. This is not so for the M.A. scheme where the problem
reduces to a trial-and-error process in which discrimination and estimation cannot
be separated. In the case of the A.R. scheme, we are in disagreement with the view that one
fits the scheme and then tests the goodness of fit by means of the methods of Quenouille
(1947) or Whittle (1952). It is not possible to fit the scheme until the order $k$ has been
determined and this is precisely what the goodness of fit test is capable of doing. In fact
the tests of Whittle and Quenouille may be expressed in terms of partial serial correlations.


3. THE GENERAL METHOD OF APPROACH


In the distribution theory of serial correlation coefficients, it is necessary to distinguish
between two seemingly independent concepts:

(i) The circularity of the serial correlations introduced by Hotelling, which makes it
possible to derive exact distributions in the first place.

(ii) The Dixon-Koopmans smoothing technique which leads to more tractable distributions.

In the case of random series, it has been shown that there exists high moment agreement
between the smoothed and exact circular distributions, with the consequent
assumption that the former provide adequate approximations to the latter. When the
variables are autocorrelated, the first moments of these distributions do not agree and it
has been concluded that the smoothed distributions are no longer satisfactory. It is
suggested that this is to lose sight of the initial aim, viz. to approximate to the distributions
of the non-circular statistics.

In this paper it will be shown that there is considerable evidence that not only are the
smoothed distributions tractable but also that they provide much better approximations
to the distributions of the non-circular statistics than do the exact circular distributions.
This provides a partial solution to the controversial problem as to whether one is losing
power by working with circular statistics. It is suggested that one should always calculate
non-circular statistics and use the distribution theory of the smoothed circular statistics.

In I, the author derived an ad hoc method for constructing $p_s(v_2)$ (the subscript $s$ indicating
that the distribution has been smoothed). This method of approach may be generalized in
the following manner:


Writing the partial serial correlation between $x_t$ and $x_{t+k}$ in the form


= Bla 
vy = BR DIR. 


where | I m % Th 
| 
Relan | m UE 
| 1 | 
Te Tea 


be 
and Ri" is the cofactor of the first element of the (k+ 1)st row of R, the method may 
considered in four stages: 


(1) $v_1 = r_1, v_2, \dots, v_k$ are assumed to be independent as far as the smoothed distributions
are concerned.

(2) $r_k$ is expressed as a function of $v_1, v_2, \dots, v_k$.

(3) Using (1), the moments of $v_k$ may be derived by substituting for the known values of
the moments of $v_1, v_2, \dots, v_{k-1}$ and $r_k$.

(4) The joint distribution of $v_1, v_2, \dots, v_k$ may then be transformed to the joint distribution
of $r_1, r_2, \dots, r_k$, and the latter shown to have the same moments as the smoothed distribution.

It may then be concluded that stage (4) justifies the assumption made in (1).

The method for deriving the moments of the smoothed distribution has been described
in detail in I and will be used quite extensively in what follows.
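Stage (2), expressing $r_k$ as a function of $v_1, \dots, v_k$, can be sketched with the standard Durbin-Levinson recursion. This is an illustration in plain Python under the assumption that $v_k$ is the usual partial serial correlation of lag $k$; it is not the determinantal form used in the paper, and the function names and input values are hypothetical.

```python
def partials_from_serials(r):
    """v_1..v_k from r_1..r_k by the Durbin-Levinson recursion."""
    phi, v = [], []  # phi holds the coefficients of the previous order
    for k in range(1, len(r) + 1):
        num = r[k - 1] - sum(phi[j] * r[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(phi[j] * r[j] for j in range(k - 1))
        vk = num / den
        phi = [phi[j] - vk * phi[k - 2 - j] for j in range(k - 1)] + [vk]
        v.append(vk)
    return v

def serials_from_partials(v):
    """Stage (2): recover r_1..r_k from v_1..v_k by running the recursion in reverse."""
    phi, r = [], []
    for k in range(1, len(v) + 1):
        den = 1.0 - sum(phi[j] * r[j] for j in range(k - 1))
        rk = v[k - 1] * den + sum(phi[j] * r[k - 2 - j] for j in range(k - 1))
        r.append(rk)
        phi = [phi[j] - v[k - 1] * phi[k - 2 - j] for j in range(k - 1)] + [v[k - 1]]
    return r

r = [0.5, 0.3, 0.1]                  # illustrative serial correlations
v = partials_from_serials(r)
r_back = serials_from_partials(v)
print(v, r_back)
```

For lag 2 the recursion gives $v_2 = (r_2 - r_1^2)/(1 - r_1^2)$, and the round trip from $r$ to $v$ and back should reproduce the input.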


4. NULL DISTRIBUTIONS FOR $\bar v_1$ AND $\bar v_2$


It is now proposed to extend the work on $p_s(v_2)$ considered in I so as to cover the case of
a fitted mean. It was pointed out in §9 of this publication that the method for $v_2$ could not
be used for $\bar v_2$ since:
(1) $p_s(\bar r_1)$ was not known.
(2) The assumption of independence of $\bar r_1$ and $\bar v_2$ was not tenable since it led to incorrect
bivariate moments for the constructed distribution $p(\bar r_1, \bar r_2)$.
It is necessary to revise these statements in the light of recent work. (1) is not so since
$p_s(\bar r_1)$ may be constructed from a knowledge of the smoothed moments given by Dixon
(1944), viz.

(Bb-1)Qb-3)..1 —— udi 
Mak = Gy 1) (w+ 8) --- (n 25— 1) 
(2k —1) (2k—3)...1 E 


Kk- (n — 1) (n+3).--(m+2k—1) 


Dixon fitted a type 1 eurve using the first two moments, but it is obvious that since they 
satisfy the relationship ntl) iy 
Hek- = -( ted 


n—1 




these ar 
e are the moments of the function 


ie T(4n+3) pyi- ntl : 
pala) = rra OW D i d Ki 


This is, in fact, another form for the expression obtained by writing $\rho = 0$ in equation (5·4)
given by Daniels. The extension to the case where $\rho \neq 0$ is treated generally in the next
section.
m in this section that the assumption of inde- 


As f: 
bende ar as (2) is concerned, it will be show 
Spean, of 7, and i, leads to correct moments of the form é(7275), while those given by 
burp E $) differ from their smoothed values by amounts which are negligible for practical 
Ses, 


h : d 
© assumption of independence of 7 leads to the following equations: 


and vs 


U EUr) = £09) 40 7D" (64) 
Si 
ag (4-1) and (4-2), these may be solved successively to give 
n 
= -6ü0-9rp 
+2) 
__n(n+3)_ = pc 
npe t 2) ara) 
2) (n4- 4) 
4 " n(n+ 2) ( 
IET = 4 PY my m+ eer 5)’ 
et ^ 
©- and these result in the following forms for the lower moments of Da: 
A 1 Gt i) 
L) E) = 532ln—-V" 
3 n— 
6 un 3 (= i) 
e0) = — =O 469 = GED WH nT 
Hu 
t is not difficult to show that the general results take the form 
9(2k— 1) 2k-3) is 
saz) 7 7-10) (n4-3)- a 2)’ (4:5) 
=1)(2k—3) a1 
+4k-1 (2k-1)( " 
(e = (e) m+ (n+)... (e 2k)’ pu) 


nd the latter may be rearranged to give 
2h +1) (2k—1).-.1_ onore Bice ‘}. 
n 


do NE n Lets eon n(n-42).. m+ 2k) | (n+2) (a+ 2k 
n—1 : 


(n+ 2). 
noments of the function 


t these are ther 


lt 
follows immediately th? 
k(1— aefa -a-i (1—*$ a), (4:7) 


p) E 


2 
Where gie B el Iin-D- , BG, ine 9. 




It may be seen that the dominant term agrees with the saddlepoint approximation given 
by Daniels; it is also a type y distribution to the first order. :ned by 
It is necessary now to derive the moments of the constructed distribution obtained by
transforming $p(\bar r_1, \bar v_2) = p(\bar r_1)\,p(\bar v_2)$. The problem is not entered upon in its greatest generality
here, since the method is similar to the one used in I for $p(v_2)$. For example, the moment
Mə, of the constructed distribution may be evaluated as follows: 
EFT) = 462) (5, (1 — 79) 


= hag 
(n — 1) (n+ 1) (n+3)’ 
and this agrees with the value derived for the exact moment (and hence th 
moment if its order is less than n) by the author in I. 
In a similar manner, it may be shown that 


e smoothed 


EF, Fa) = ET) + E(B.) etr I= n)5 
EAR d 


NEUE , 
rs 
which is to be compared with the value given in I, viz, i 
that moments of the form Hs, are correct whilst those of the form Fies- do not agr 
the moments of the smoothed distribution.
It may thus be concluded that as far as the smoothed distributions are concerned, the
partial serial correlations are not independent when the serial correlations are corrected for
the mean. Also that the type 7 and type Y distributions in this case are not Sm 
distributions but represent dominant terms in these forms. 


B pe: 
—l/(n— 1). Generally, it Es 


5. NULL DISTRIBUTIONS FOR Uz AND v, 
The general method will now be applied to the 


1—r5) —ryrs(1 — r5) +75(1 —72) 
v, = PER, — oi T5) — r,S( 2) +73 i 


R, 
This may then be simplified to give | ) 
(ra— r4) +r = 79) 1 — 99 e 
P a-50a-s 
and then rewritten to Satisfy condition (2) of $4 in the form (52) 
rs = 0,(1— TR) +r —7 (1 — 7 l nar (3 
where (1— 72, = (1—72) (1—28) -- (.— 


In a similar manner, it may be shown that
m-1 7", T2 Ta 
T | rs- 1 fu fs 
v, = Rp5[R, = 1+ | 3 Hs. 
0 DP is 
This |n- 7n 1 
ma ss . č 
y be simplified using (5:2), yielding 
=r) n- va) — v (1 vy). (5-4) 


and rewri he DE Poo) = e RAS 
itten to satisfy condition (2) of $3 in the form 

(1-23) (1 v9) 

+d- (1-28) (+v). (5-5) 


Equations (5·1) and (5·4) emerge as very simple formulae for calculating $v_3$ and $v_4$. The
saving in labour by using these formulae instead of the original determinantal forms is
considerable.

We now proceed to determine the smoothed moments of $v_3$ using the known results for
the moments of $v_2$ and also that
$$\mathcal{E}_s(r_1^{2k+1}) = 0, \qquad \mathcal{E}_s(r_1^{2k}) = \frac{(2k-1)(2k-3)\cdots 1}{(n+2)(n+4)\cdots(n+2k)}. \qquad (5{\cdot}6)$$

Taking expectations throughout in (5·2) and using condition (1) of §3, it follows immediately
that $\mathcal{E}(v_3) = 0$. Taking the cubes of both sides and then expectations, it follows that
$\mathcal{E}(v_3^3) = 0$ and the remaining terms involve expectations of odd powers of $r_1$
which are known to be zero from (5·6). Proceeding in an analogous manner, it may be
shown by means of an inductive argument that the odd moments are all zero and it remains
to find the even moments. These may be derived by the method used in the previous
section, viz. by considering (5·2) raised to successive powers. However, the algebra is
slightly simpler if we proceed as follows. Since $(r_3 - r_1)$ may be written in the form

(1-09 a -»9" y (1-08) (129. 


D Lrs 
a-pa- Ti 


it fol], 
o 

WS by taking expectations that 
n 2 eno qe 
1 p T eC ea 
It is possible to show by differentiation of the smoothed characteristic function in the
manner of I that $\mathcal{E}_s(r_1 r_3) = 0$. It follows therefore that $\mathcal{E}_s(r_1 - r_3)^2 = 2/(n+2)$, and
substitution in (5·7) yields $\mathcal{E}_s(v_3^2) = 1/(n+2)$.
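The even moments in (5·6), and the value $1/(n+2)$ just obtained, are those of a density proportional to $(1 - r^2)^{\frac12(n-1)}$ on $(-1, 1)$. That density is an assumption made here for illustration, chosen because it reproduces (5·6); under it $r^2$ follows a Beta$(\tfrac12, \tfrac12(n+1))$ law, so the moments are ratios of gamma functions. A sketch using only the Python standard library:

```python
import math

def even_moment_density(n, k):
    """E[r^(2k)] when r has density proportional to (1 - r^2)^((n-1)/2) on (-1, 1).

    Under this (assumed) density r^2 is Beta(1/2, (n+1)/2), whose k-th moment is
    Gamma(a+k) Gamma(a+b) / (Gamma(a) Gamma(a+b+k)).
    """
    a, b = 0.5, 0.5 * (n + 1)
    log_moment = (math.lgamma(a + k) - math.lgamma(a)) - (math.lgamma(a + b + k) - math.lgamma(a + b))
    return math.exp(log_moment)

def even_moment_formula(n, k):
    """(2k-1)(2k-3)...1 / [(n+2)(n+4)...(n+2k)], the form of the even moments in (5.6)."""
    num = math.prod(range(1, 2 * k, 2))
    den = math.prod(n + 2 * j for j in range(1, k + 1))
    return num / den

n = 20  # illustrative series length
for k in range(1, 5):
    print(k, even_moment_density(n, k), even_moment_formula(n, k))
```

For $k = 1$ both expressions reduce to $1/(n+2)$, the value found above for $\mathcal{E}_s(v_3^2)$.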


farce: 
similar manner, we may write 
4) (n +16) = 6n(n+ 12) 

6) m+?) Q8) (58) 


3(n4-l 


dy = Fat 3) (n+ 4) (m+ 


and the left-hand side may be evaluated using the following smoothed moments which are
obtained quite easily:


& (8) = 0. (59) 
E F n+12 (5:10) 

74599 7 G3) (cay Gy 
ae (5:10) 

6 (7315) = (n+ 2) (n +4) ( (n +6)" 

12(n +7) 
E E d o. 
Hence ara — 7) (13) (n+4)(n 46)’ 
and substitution in (5-8) yields & (04) = 3 


(3) 44)" 

Continuing the process, it appears that the moments of 
this is quite easily verified by writing down the joint 
forming to p,(r,. 72,73) which may then be integrated t 
structed distribution has the correct marginal distribu 
It has not been found possible to treat the problem o 
of the smoothed and constructed distributions in it 
be illustrated in the case of (r2: r,). i 


From (5-2) we obtain an expression for the mome 
the form 


ct 
v, are those of type a form. In Ee. 
distribution of v4, v, v, and tr? 


on- 
o give p.(7,), showing that the © 
tions, 


nts 
fcomparing the multivariate mo P. 
S greatest generality. The method " 


+ on iD 
nt of the constructed distribution ! 
4 (rin) = FP) -EPDE Lege 

E (2s — 2) (2s — 1) (2s— 3) agl 
(n--2) (n4)... mF 354-3 
(5·12)

The corresponding smoothed moment may be derived from the smoothed moment-generating
function which is obtained by writing $\alpha_j = 0$ and $k = 3$ in (6·7), viz.

n [(?7 
Ps(9o; 91 05, 03) = exp [- al. log (1 —65— 0, cos o — 0, cos 2g — 0, cos 3a} da] : 
From this we may obtain $\mathcal{E}_s(r_1^{2s-1} r_3)$ by differentiating partially with respect to $\theta_3$, placing
$\theta_2 = 0 = \theta_3$, and then expanding this in the form of an infinite series in powers of
$x = \theta_1/(1 - \theta_0)$ (for further details, see §7 of I). Thus


| [ AR j^ —0,9—0 n [*"  cos8ada 
va) n sex = og {1 0 1908 a) da, E 3 
[as Dt PL” ado 27 Jo (1-6, —0, cosa) 

i E P d 
and the first integral may be expanded in the form of an infinite series and integrated
term by term giving


o (2j-2)8j—0 (2—3)... .. 
an 5, 8 OG To aN 
ve 


r n 
The other factor may be written as [3(1 +40 —27))], and a series expansion for this has be? 
given in I. The two series may then be multiplied giving finally 


ee ree ee 
[oe [5*8 m eee] aay, 
346; : - 


0 


and by integration it may be shown that $\mathcal{E}_s(r_1^{2s-1} r_3)$ is the same as the expression given by
(5·12). In a similar manner it is possible to show that a large number of the simpler trivariate
moments of the constructed distribution agree with those derived from the smoothed
characteristic function.

The procedure for constructing $p_s(v_4)$ is similar in principle but leads to much more
complicated equations for the moments. It has been verified that these are the moments of


the Ki 2 distribution. 
n eee sampling experiments have been conducted on artificial series which 
has Elo ne fact that the distributions of the v; alternate in the case of random series. 
the A.r. E en shown that the distribution theory provides a basis for O-diserimination m 
2 upwards "is For example, in the case of schemes of orders 1, 2 and 3 using samples of 
he case of vie Soir been shown that the number of correct classifications is high except in 
rt series with very strong autocorrelation. It is conjectured that this effect 


May be explai 

Dr ved by the fact that G(v,) in an A-R- (k—1) depends On 4,3... Xr- tO 

in a E : is not difficult to show that p.(*r LE ayı) is the same as the distribution of v, 
ndom scheme, so that it appears that the smoothed distributions are inadequate 


Whe. 
M2 is fa; E Rm 
is fairly small and a, ... 2-1 81€ close to their limiting values. 


6. SOME GENERAL THEOREMS ON NON-NULL DISTRIBUTIONS

There seems to be some confusion in the literature about the derivation of non-null distributions, and a number of those put forward are not density functions at all since they do not integrate out to unity. The only exact theorem in this work is that due to Madow (1945), but this applies to exact distributions only. In this section it is proposed to apply a unified method in order to derive non-null distributions, viz. the exact and smoothed non-null joint distributions of the circular statistics. In view of its practical importance as an approximation to the exact joint distribution of non-circular statistics, the distribution with serials corrected for the mean is of particular interest. In the case of the Markoff scheme it leads to a smoothed distribution for the first lag serial correlation corrected for the mean, analogous to Leipnik's distribution.

Defining the modified standardized variance and serial covariances by
6j = 353 2 Euge
with similar expressions $\hat{c}_j$ when there is a mean correction, it may be shown that the joint characteristic function of these statistics in the A.R. ($k$) scheme
MS hs oc NET o d (a= — 1) (6-1)
may be written in the form
$9 (85... 04) = || HL. [ehem tt p C08 = | i - 
ont PM 
EE E (6-3) 


1-1 
Al -n (eh cos = 


ang 
86, ...0 = 1 
gs 


ae) ent i; 
j -Sg : 
.,k) and dy) = Maj. |J, | is the 


= 1,2;.- 


6-1) and is given by the circulant 


k , 
2230/95 ü 
to the zs in ( 


— apoj} (6:4) 


Te, 
Speoti 
ct n 
ively, where b; = a;— i0;. % 


* 


copia. i=l 
n of the transformation from the 2's 
) i m 
| Fe) = 10 =a MF 7 


It is necessary now to derive the smoothed forms of these functions. The smoothing process 
will be defined formally in the following manner: 
If 


2 
lim — 5,108 $o; 2243 Op, 04, ..., hp, m) = L(0,, «+5 pes o, N28) 


then the smoothed characteristic function will be defined by 


$s(Ós. ...,0,) = exp Lin, (6:8) 
Since this procedure is rather arbitrary, it will be necessary to show that this function satisfies the relationship $\phi_s(0, 0, \ldots, 0) = 1$. (6·6)

It is obvious that it satisfies all the other necessary conditions for being a characteristic function which have been given by Kendall (1948), since (6·2) satisfies these conditions and the latter are unaffected by smoothing. In order that (6·5) is to represent a true characteristic function, it must satisfy the further condition that, when inverted, it yields a positive function.


The following theorem will now be stated and its proof given in outline. 


THEOREM. Provided that $\alpha_1, \alpha_2, \ldots, \alpha_k$ are such that (6·1) generates a stationary process, the smoothed characteristic function corresponding to (6·2) is given by
22 27 
dif isin Gel temp [-zf log {by +b, cos r+... +b, cos pay], ey 


and it satisfies condition (6·6).

The proof resolves itself into a consideration of the separate limits of the Jacobian and the finite product in (6·2). Using the following lemma, it may be shown that the contribution due to the Jacobian is zero.


LEMMA. Provided that the $\alpha_1, \alpha_2, \ldots, \alpha_k$ are such that (6·1) generates a stationary process, $|J_n| \to 1$ as $n \to \infty$.

Proof. This follows by observing that $|J_n|$ may be written in the form
$$(1 - \lambda_1^n)(1 - \lambda_2^n) \cdots (1 - \lambda_k^n),$$ (6·8)
where $\lambda_1, \lambda_2, \ldots, \lambda_k$ are the roots of the equation
$$z^k - \alpha_1 z^{k-1} - \ldots - \alpha_k = 0.$$
From a general theorem due to Wold (1938), it is known that the roots of this equation have moduli less than unity so that each term in (6·8) tends to 1 as $n \to \infty$.

The proof of the theorem is completed by showing that the product term in (6·2) tends to the expression (6·7), and the latter may be shown to satisfy (6·6) by justifying an interchange of limit operations.
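The lemma can be checked numerically. The following is a minimal sketch under stated assumptions, not the author's computation: it assumes the circular A.R. ($k$) scheme has the form $x_t - \alpha_1 x_{t-1} - \ldots - \alpha_k x_{t-k} = \epsilon_t$ with subscripts reduced modulo $n$, builds the matrix of that transformation, and compares its determinant with the product of the factors $(1 - \lambda_i^n)$. The function names are hypothetical, and the roots are written out for $k = 1$ and $k = 2$ only.

```python
import cmath


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for c in range(col, size):
                a[row][c] -= factor * a[col][c]
    return det


def circular_ar_matrix(alphas, n):
    """Matrix taking (x_1..x_n) to (eps_1..eps_n) in the assumed circular scheme."""
    m = [[0.0] * n for _ in range(n)]
    for t in range(n):
        m[t][t] += 1.0
        for lag, alpha in enumerate(alphas, start=1):
            m[t][(t - lag) % n] -= alpha  # subscripts wrap round modulo n
    return m


def root_product(alphas, n):
    """Product of (1 - lambda_i^n) over the roots of z^k - a_1 z^(k-1) - ... - a_k = 0."""
    if len(alphas) == 1:
        roots = [complex(alphas[0])]
    elif len(alphas) == 2:
        a1, a2 = alphas
        disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
        roots = [(a1 + disc) / 2.0, (a1 - disc) / 2.0]
    else:
        raise ValueError("only k = 1 or 2 is handled in this sketch")
    product = complex(1.0)
    for lam in roots:
        product *= 1.0 - lam ** n
    return product.real  # imaginary parts cancel for a conjugate pair


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        jac = determinant(circular_ar_matrix([0.5, -0.3], n))
        print(n, jac, root_product([0.5, -0.3], n))
```

Run as a script, the two printed columns should agree for each $n$ and drift towards 1 as $n$ grows, which is the content of the lemma for this particular stationary choice of coefficients.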
Since 


(1-0 a — ... —a,)2 
e 
it follows that the same relationship holds for the corresponding smoothed functions. We now proceed to express the non-null distributions corresponding to these characteristic functions in terms of their null distributions. For example, Fourier inversion of (6·2) yields


qa rr 2-209310: 5... — 40,11 
Bo- 0) = 0.0 |! Oly — ts 94)? — iĝo — i0, Aj. 


= l. T © etg 7 je 
(Co. weep Cy |) Fe mal, zr eži p 95569) TE d6, 




whilst the transformation $b_j = (\alpha_j - i\theta_j)$ results in
k 

D(Co +++ C | &;) = exp E X a |J | P(o wey Op | = 0). 
=0 


It is then possible

(i) to transform to new variables $c_0, r_1, \ldots, r_k$,

(ii) to use the fact that $c_0$ is distributed independently of $r_1, r_2, \ldots, r_k$ when the $\alpha_i = 0$, and

(iii) to integrate with respect to $c_0$ between the limits 0 and $\infty$.


In the case of (6·2) and (6·3) this leads to


J. 
Brig sr] TET d | EX ses Th | Ce = 9). (6-9) 
0 q^ Y» diia 
ME AC a * 
|J. =a D^ qmuessmla 0. (610) 


Pln Fa Fpl) = 
(s Fo, Pe 8) = Fag cas recti 


respectively, where the null hypothesis distributions $p(r_1, \ldots, r_k \mid \alpha_i = 0)$ and $p(\hat r_1, \ldots, \hat r_k \mid \alpha_i = 0)$ have been given by Quenouille (1949) and for a more general case by Watson (1956). It is to be observed that (6·9) and (6·10) are different from the distributions for an A.R. ($k$) scheme given by Quenouille (1949).

In the case of the smoothed distributions, the procedure is identical except that in this case it is possible to prove that $p_s(c_0)$ is independent of $p_s(r_1, r_2, \ldots, r_k)$. This follows by a modification of a theorem due to Pitman (1937), replacing the discrete set of characteristic vectors of the quadratic forms in the definition of the serial correlations by a continuous set in the same manner as Koopmans (1942).

The above method then leads to the following smoothed distributions:


1 
: ns piro ted % = 0) 611 
bít srl s aE p ( ) 
2 
(1-247729 F. Fy | Qi = 

f PsP nees Fh | o) = me ces i 7, | &; 0). (6:12) 

6. $ E 2 a 3 
(6·11) is the generalization of Leipnik's distribution, whilst (6·12) is new and corresponds to equation (9·14) given by Daniels. In the next section we shall be concerned with (6·11) and (6·12) when $k = 1$.
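7. NON-NULL DISTRIBUTIONS ON THE MARKOFF SCHEME

For an A.R. (1), equations (6·9), (6·10), (6·11) and (6·12) reduce to
$$p(r_1 \mid \alpha_1) = (1 - \alpha_1^n)\, p(r_1 \mid \alpha_1 = 0)\, [1 + \alpha_1^2 - 2\alpha_1 r_1]^{-\frac12 n},$$ (7·1)
$$p(\hat r_1 \mid \alpha_1) = \frac{1 - \alpha_1^n}{1 - \alpha_1}\, p(\hat r_1 \mid \alpha_1 = 0)\, [1 + \alpha_1^2 - 2\alpha_1 \hat r_1]^{-\frac12 (n-1)},$$ (7·2)
$$p_s(r_1 \mid \alpha_1) = p_s(r_1 \mid \alpha_1 = 0)\, [1 + \alpha_1^2 - 2\alpha_1 r_1]^{-\frac12 n},$$ (7·3)
$$p_s(\hat r_1 \mid \alpha_1) = \frac{1}{1 - \alpha_1}\, p_s(\hat r_1 \mid \alpha_1 = 0)\, [1 + \alpha_1^2 - 2\alpha_1 \hat r_1]^{-\frac12 (n-1)}.$$ (7·4)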


The null distributions for (7·1) and (7·2) have been given by R. L. Anderson (1942), but these exact distributions are of no use practically since, in addition to being very complicated, they deviate considerably from the distributions of the non-circular statistics. This observation has already been made by Kendall (1954), who showed that whilst the expectation of the circular first lag serial correlation is given by


be 1 j ] 
ó (File. = a —— 0-42)«0[, CR 
n n? 
the corresponding formula for the non-circular coefficient is given by
= 1 ` T 
Eae = t; 0a) eo (1). Ux 
n n? 


It will be shown in this section that $\mathcal{E}_s(\hat r_1)$ agrees to the required order with (7·6), and this illustrates the agreement to $O(n^{-1})$ between the smoothed circular and exact non-circular distributions.

(7·3) is the distribution first put forward by Leipnik (1947), and (7·4) is an analogous distribution with the serial correlation corrected for the mean; this agrees with the saddle-point approximation given by Daniels as equation (5·4).

The form for $p_s(\hat r_1 \mid \alpha_1 = 0)$ has been given in §4, and it is to be observed that this function has a negative loop in the positive tail so that it is not a density function although it integrates out to unity. For moderate $n$, however, it behaves like a type 2 distribution.

We now proceed to show that the moments of (7·4) may be expressed as functions of the moments of the Leipnik distribution. It is necessary first of all to derive the first four moments of the latter distribution. The first two moments were derived by Leipnik himself by a very complicated method starting from $p_s(r_1, c_0 \mid \alpha_1)$. A simpler method is required in order to derive the higher moments of $p_s(r_1 \mid \alpha_1)$ and, also, this must be capable of extension to cover the new case $p_s(\hat r_1 \mid \alpha_1)$.

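This is obtained by expanding the distribution in the form of an infinite series in the following manner:
$$p_s(r_1 \mid \alpha_1) = p_s(r_1 \mid \alpha_1 = 0)\,(1 + \alpha_1^2)^{-\frac12 n} \sum_{j=0}^{\infty} \frac{(\frac12 n + j - 1)!}{(\frac12 n - 1)!\, j!}\, \lambda^j r_1^j,$$
where $\lambda = 2\alpha_1/(1 + \alpha_1^2)$, so that $|\lambda| < 1$, and hence
$$m'_{2k} = \mathcal{E}(r_1^{2k} \mid \alpha_1) = (1 + \alpha_1^2)^{-\frac12 n} \sum_{j=0}^{\infty} \frac{(\frac12 n + 2j - 1)!}{(\frac12 n - 1)!\,(2j)!}\, \lambda^{2j} \mu_{2k+2j},$$ (7·7)
$$m'_{2k+1} = \mathcal{E}(r_1^{2k+1} \mid \alpha_1) = (1 + \alpha_1^2)^{-\frac12 n} \sum_{j=0}^{\infty} \frac{(\frac12 n + 2j)!}{(\frac12 n - 1)!\,(2j+1)!}\, \lambda^{2j+1} \mu_{2k+2j+2}.$$ (7·8)
By substituting for the null moments $\mu_i$, viz. those of the type 2 distribution, it is possible to derive the non-null moments by manipulation of the resulting infinite series and use of the basic identity $m'_0 = 1$. The details are omitted here and the results quoted in the following form:
$$m'_1 = \frac{n\alpha_1}{n+2},$$
$$m'_2 = \frac{1}{n+2} + \frac{n(n+1)\alpha_1^2}{(n+2)(n+4)},$$
$$m'_3 = \frac{3n\alpha_1}{(n+2)(n+4)} + \frac{n(n+1)\alpha_1^3}{(n+4)(n+6)},$$
$$m'_4 = \frac{3}{(n+2)(n+4)} + \frac{6n(n+1)\alpha_1^2}{(n+2)(n+4)(n+6)} + \frac{n(n+1)(n+3)\alpha_1^4}{(n+4)(n+6)(n+8)}.$$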
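The quoted moments can be checked by direct quadrature. The sketch below is an illustration only: it assumes that the null form of Leipnik's distribution is proportional to $(1 - r_1^2)^{\frac12 (n-1)}$ on $(-1, 1)$ (this form is not written out in the present section) and multiplies it by the factor $[1 + \alpha_1^2 - 2\alpha_1 r_1]^{-\frac12 n}$ of (7·3). The helper names and the choice of Simpson's rule are ours.

```python
import math


def _simpson(func, lower, upper, steps):
    """Composite Simpson rule; steps must be even."""
    h = (upper - lower) / steps
    total = func(lower) + func(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lower + k * h)
    return total * h / 3.0


def leipnik_moment(order, n, alpha, steps=4000):
    """Raw moment of the assumed Leipnik form, using r = sin(t) to smooth the end points."""
    def kernel(power, a):
        def f(t):
            r = math.sin(t)
            # (1 - r^2)^((n-1)/2) dr becomes cos(t)^n dt under r = sin(t)
            return (r ** power) * (math.cos(t) ** n) * (1.0 + a * a - 2.0 * a * r) ** (-0.5 * n)
        return f
    half_pi = 0.5 * math.pi
    null_norm = _simpson(kernel(0, 0.0), -half_pi, half_pi, steps)
    return _simpson(kernel(order, alpha), -half_pi, half_pi, steps) / null_norm


def quoted_moments(n, alpha):
    """The four raw moments as quoted in the text."""
    m1 = n * alpha / (n + 2)
    m2 = 1 / (n + 2) + n * (n + 1) * alpha ** 2 / ((n + 2) * (n + 4))
    m3 = 3 * n * alpha / ((n + 2) * (n + 4)) + n * (n + 1) * alpha ** 3 / ((n + 4) * (n + 6))
    m4 = (3 / ((n + 2) * (n + 4))
          + 6 * n * (n + 1) * alpha ** 2 / ((n + 2) * (n + 4) * (n + 6))
          + n * (n + 1) * (n + 3) * alpha ** 4 / ((n + 4) * (n + 6) * (n + 8)))
    return m1, m2, m3, m4


if __name__ == "__main__":
    n, alpha = 20, 0.4
    print("total mass:", leipnik_moment(0, n, alpha))
    for order, quoted in enumerate(quoted_moments(n, alpha), start=1):
        print(order, leipnik_moment(order, n, alpha), quoted)
```

The zeroth moment is normalised by the null integral, so a value of 1 for non-zero $\alpha_1$ is a numerical check of the basic identity used in the derivation.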


These lead to central moments in the form
$$m_2 = \frac{1}{n+2} - \frac{n(n-2)\alpha_1^2}{(n+2)^2(n+4)},$$
$$m_3 = -\frac{6n\alpha_1}{(n+2)^2(n+4)} + \frac{2n(n-2)(3n-2)\alpha_1^3}{(n+2)^3(n+4)(n+6)},$$
$$m_4 = \frac{3}{(n+2)(n+4)} - \frac{6n\alpha_1^2(n^2 - 8n - 4)}{(n+2)^3(n+4)(n+6)} + \frac{3n\alpha_1^4(n-2)(n^3 - 14n^2 + 12n - 8)}{(n+2)^4(n+4)(n+6)(n+8)}.$$


In 
rm ofan infinite series and the moments 


asimilar . 
lar manner, (7-4) may be expanded in the fo 


€Xpre, : 
Ssed in the form è EEEN 
—-- pap 2 n +j- DU (7-9) 
Whe % = Ta) ;j-"— gu 
NEAR . T Rm " 
H, are the moments of p,(7, | % = 0) andare given by re = (4-2). Using t$ M wegen 
ased on n’ = n — 1 instead o 


T , 
in’) to : 
denote the ith moment about zero © 


2 Obser š 
rvations, it follows that (5:1) and (5-2) may 


f p,(r, | %1) but 
be written in the form 


Tiy = Hoke 
iat e (7-10) 
ZE (s) fax. 


We 

1 m 
now proceed to prove the following identity: 
1 tin’ nl t Y i T 

m, = qaot Jis = i) m; ) (7-11) 


It is necessary to distinguish between the two cases where $i$ is odd or even. Writing $i = 2k$ in (7·9), it follows that
j o (1v -2j-2) 
=) Latin’ (© (4n'—14 2)! asia son S um I 
mi. (rots. de zi Ren +3 qy-116j- 0) 


TERA 
ASI Tjok- f> 


and q-«) Um" d (7-8) and using (7:9) and (7 
Substituting for the expressions given by (7·7) and (7·8) and using (7·9) and (7·10), we obtain (7·11). The proof for $i$ odd is similar. Since the values of $m'_i$ are known for $i = 1, 2, 3$ and 4, it follows that $\hat m'_i$ may be derived for $i = 1, 2, 3$. If the necessary substitution is made in (7·11), it may be shown that
1 Ana 792) ie ; (7:12) 
m =a- (124-325) +21) (n+ 30-a) (m-—1)(n+3) = 
(nume 
9 16n a 
2na,(1 + 244) _ = 1 " 
mW 1 dh. a erry 1 3 map (7:13) 
We. _ + cr 3) (n1) )m45) (1—« 
me ^I NES T (n? 1) (3) A 1) 
n 4 tia 
35- 3i mM (n3) (n 8) Sed 
3 —10)(n 
(n 3n(n—3) B bod n(n-4- 1) 
7 (n2—1) (m+ 3)1—e, l-a (n+3)(n+5)(n +7) (7-14) 
These lead to the following asymptotic expressions for the mean and variance:
1 = 1 Sai ï 
ms reU n? ( atr i)o) (7-15) 
1 
1 2 3 1 
Z2 z2-(1-—9031)t;s — 3-r 4o4 4- 14a = E 
Me ~ n g ai) jit j ai} «o(r;). (1:16) 


The above results are interesting in that they show that

(i) $\hat m'_1$ agrees up to $O(n^{-1})$ with the first moment of the non-circular statistic independently given by Kendall (1954) and Marriot & Pope (1954).

(ii) The term in $O(n^{-2})$ makes a large contribution to the variance even for moderate $n$ (it may be seen that the contribution is positive if $\alpha_1 > 0·26$). This is in agreement with the conclusions of Marriot & Pope (1954) who conducted sampling experiments with artificial A.R. (1) series and found that the expected variances to $O(n^{-1})$ were significantly less than the observed variances. The variance of the non-circular statistic is not known to $O(n^{-2})$ so that we suggest that formula (7·16) be used as an estimate.
Table 1

Source                  α₁    n    k*   Observed   Expected var.   Expected var.   P₁         P₂
                                        variance   to O(n⁻¹)       to O(n⁻²)
Marriot & Pope (1954)   0·4   20   60   0·0529     0·0420          0·0466          0·09       …
                        0·4   40   40   0·0398     0·0210          0·0224          0·0007     …
                        …     …    …    …          …               …               …          …
                        …     …    …    …          …               …               …          …
                        0·8   40   40   0·0156     0·0090          0·0154          0·003      …
                        0·8   60   30   0·0141     0·0060          0·0088          <0·00001   …
Rao & Som (1951)        0·8   15   50   0·0509     0·0240          0·0696          <0·00001   …
                        0·8   35   25   0·0212     0·0103          0·0186          0·002      …
Kendall (1949),         0·7   22   40   0·0462     0·0232          0·0390          0·0002     0·2
  Series 9 (extended)

* k = number of subsets on which the variances are based.
Table 1 shows the improvement in the estimates of variance when the term in $O(n^{-2})$ is taken into account. The observed variances of $\hat r_1$ in the first six series are those given by Marriot & Pope (1954, Tables 9 and 14). The investigation was supplemented by calculating the variance of $\hat r_1$ for the following series:

(a) Two Markoff series with $\alpha_1 = 0·8$ given by Rao & Som (1951), the serial correlations being based on 50 sets of 15 items and 25 sets of 35 items respectively.

(b) 40 sets of 22 items from a series given by Kendall (1949) which was extended from 500 to 1000 terms by the author.

In all cases the serial correlations were non-circular but varied in definition owing to the use of a pooled or unpooled variance in the denominator. The agreement between observed and expected variances to $O(n^{-2})$ is good except in two cases.
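As a small arithmetic check on Table 1, the sketch below recomputes the expected variance to $O(n^{-1})$ from the familiar leading term $(1 - \alpha_1^2)/n$. That this is what the tabulated column contains is an inference from its legible entries, not a statement taken from the text, and the $O(n^{-2})$ column is not recomputed because formula (7·16) is not legible here. The names are hypothetical.

```python
def variance_first_order(alpha, n):
    """Leading term (1 - alpha^2)/n of the variance of the first serial correlation."""
    return (1.0 - alpha * alpha) / n


# (alpha_1, n, tabulated expected variance to O(1/n)) for the legible rows of Table 1
TABLE_ROWS = [
    (0.4, 20, 0.0420),
    (0.4, 40, 0.0210),
    (0.8, 40, 0.0090),
    (0.8, 60, 0.0060),
    (0.8, 15, 0.0240),
    (0.8, 35, 0.0103),
    (0.7, 22, 0.0232),
]


def max_discrepancy(rows=TABLE_ROWS):
    """Largest absolute gap between the formula and the tabulated four-decimal value."""
    return max(abs(variance_first_order(a, n) - v) for a, n, v in rows)


if __name__ == "__main__":
    for alpha, n, tabulated in TABLE_ROWS:
        print(alpha, n, round(variance_first_order(alpha, n), 4), tabulated)
```

Every legible row agrees with the formula to the four decimals printed, which supports reading that column as the first-order variance.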

The analysis of this section has been extended to the Yule scheme where the situation is more complicated; this will be published at a later stage. The same effect has been observed here as with the Markoff scheme, viz. that estimates of variances and covariances to $O(n^{-1})$ are inadequate for autocorrelated series.

In conclusion, I would like to record my thanks to the Department of Scientific and Industrial Research for a maintenance grant during the period of this research.


REFERENCES

Anderson, R. L. (1942). Ann. Math. Statist. 13, 1.
Daniels, H. E. (1956). Biometrika, 43, 169.
Dixon, W. J. (1944). Ann. Math. Statist. 15, 119.
Jenkins, G. M. (1954). Biometrika, 41, 405.
Kendall, M. G. (1948). Advanced Theory of Statistics, 1. London: Charles Griffin and Co.
Kendall, M. G. (1949). Biometrika, 36, 267.
Kendall, M. G. (1954). Biometrika, 41, 403.
Koopmans, T. L. (1942). Ann. Math. Statist. 13, 14.
Leipnik, R. L. (1947). Ann. Math. Statist. 18, 80.
Madow, W. G. (1945). Ann. Math. Statist. 16, 08.
Marriot, F. H. C. & Pope, J. A. (1954). Biometrika, 41, 390.
Pitman, E. J. G. (1937). Proc. Camb. Phil. Soc. 38, 212.
Quenouille, M. H. (1947). J. …
Quenouille, M. H. (1949). Ann. Math. Statist. 19, 561.
Rao, S. R. & Som, K. S. (1951). Sankhyā, 11, 2.
Rudra, A. (1952). Biometrika, 39, 5.
Rudra, A. (…). Calcutta Statist. Ass. Bull. 5 (18), 59.
Rudra, A. (1955). Sankhyā, 15, 9.
Watson, G. S. (1956). Biometrika, 43, 161.
Whittle, P. (1952). Biometrika, …
Wold, H. (1938). A Study in the Analysis of Stationary Time Series. Uppsala: Almquist and Wicksell.




MISCELLANEA 


A class of distributions for which the maximum-likelihood estimator is unbiased 
and of minimum variance for all sample sizes 


By D. E. BARTON 


University College London 


1. We consider a sample $(x_1, \ldots, x_n)$ randomly and independently drawn from a population whose law $p(x; \theta)$ depends on an unknown parameter $\theta$ in such a manner as to admit a sufficient statistic for $\theta$.

Blackwell (1947) noted that if $t'$ is an unbiased (but possibly inefficient) estimator of $\theta$, then, if $t$ is a statistic sufficient for $\theta$,
$$T = \mathcal{E}(t' \mid t)$$ (1)
is purely a function of $t$ (and therefore sufficient for $\theta$) and
$$\mathcal{E}(T) = \theta.$$ (2)
Further, since
$$\operatorname{Var} t' = \operatorname{Var} T + \mathcal{E}[\operatorname{Var}(t' \mid t)],$$ (3)
he noted that $T$ has a variance not greater than that of any unbiased estimator from which it may be derived. Hoel (1951) showed that a function of $T$, sufficient for $\theta$ and obeying (2), is effectively unique and so we may talk of the best unbiased estimator.

Since (as is typified by the example Blackwell gives) it is generally easy to find an unbiased estimator of $\theta$ and the method of maximum likelihood yields a sufficient statistic, $\hat\theta$ say, for $\theta$ in these circumstances, it would seem that it is always possible to find the best unbiased estimator of $\theta$. Unfortunately, $\mathcal{E}(t' \mid \hat\theta)$ may be a function which it is not feasible to evaluate. For instance, in the example of equation (13) the sample arithmetic mean, $\bar x$, is unbiased whilst the sample geometric mean, $g$, is sufficient for $\theta$, but it is not possible to evaluate the integral for $\mathcal{E}(\bar x \mid g)$. The present note remarks that it is always possible to choose a function $\phi = \phi(\theta)$ for which the maximum-likelihood estimate is the best unbiased estimator. The method is extended to the multi-parameter case.
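Relations (1) to (3) can be verified exactly in a small discrete case. The sketch below is an illustration of Blackwell's argument under our own assumptions, not an example from the note: the sample is $n$ Bernoulli trials with success probability $\theta$, the crude unbiased estimator is $t' = x_1$, and the sufficient statistic is $t = \sum x_i$. All names are hypothetical and exact rational arithmetic is used.

```python
from fractions import Fraction
from itertools import product


def rao_blackwell_bernoulli(n, theta):
    """Return (E(T), Var t', Var T, E[Var(t'|t)]) for t' = x_1 and t = sum of the x_i."""
    theta = Fraction(theta)
    joint = {}  # joint distribution of (t', t)
    for xs in product((0, 1), repeat=n):
        p = Fraction(1)
        for x in xs:
            p *= theta if x else 1 - theta
        key = (xs[0], sum(xs))
        joint[key] = joint.get(key, 0) + p
    marginal_t = {}
    for (_, t), p in joint.items():
        marginal_t[t] = marginal_t.get(t, 0) + p
    # T = E(t' | t) and Var(t' | t) for each attainable value of t
    cond_mean = {t: sum(tp * p for (tp, tt), p in joint.items() if tt == t) / marginal_t[t]
                 for t in marginal_t}
    cond_var = {t: sum((tp - cond_mean[t]) ** 2 * p
                       for (tp, tt), p in joint.items() if tt == t) / marginal_t[t]
                for t in marginal_t}
    mean_T = sum(cond_mean[t] * marginal_t[t] for t in marginal_t)
    var_T = sum((cond_mean[t] - mean_T) ** 2 * marginal_t[t] for t in marginal_t)
    var_t_prime = sum((tp - theta) ** 2 * p for (tp, _), p in joint.items())
    mean_cond_var = sum(cond_var[t] * marginal_t[t] for t in marginal_t)
    return mean_T, var_t_prime, var_T, mean_cond_var


if __name__ == "__main__":
    print(rao_blackwell_bernoulli(4, Fraction(1, 3)))
```

In this case $T$ reduces to the sample proportion $t/n$, so the variance falls from $\theta(1-\theta)$ to $\theta(1-\theta)/n$, and the difference is exactly the mean conditional variance, as (3) requires.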


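2. Koopman (1936) and Pitman (1936) showed that the most general form of probability law admitting a sufficient statistic is
$$p(x; \theta) = \exp\{\alpha(\theta) f(x) + \beta(\theta) + g(x)\},$$ (4)
where $\alpha$, $\beta$, $f$, $g$ are functions of the stated variables. The mean value of the log-likelihood function is thus
$$L = \alpha(\theta)\bar f + \beta(\theta) + \bar g,$$ (5)
where
$$\bar f = \frac1n \sum_{i=1}^{n} f(x_i), \qquad \bar g = \frac1n \sum_{i=1}^{n} g(x_i).$$ (6)
Hence
$$\frac{\partial L}{\partial\theta} = \alpha'(\theta)\bar f + \beta'(\theta),$$ (7)
and the maximum-likelihood equation is
$$0 = \alpha'(\hat\theta)\bar f + \beta'(\hat\theta),$$ (8)
whilst, since the expectation of (7) vanishes,
$$0 = \mathcal{E}\left(\frac{\partial L}{\partial\theta}\right) = \alpha'(\theta)\,\mathcal{E}(\bar f) + \beta'(\theta).$$ (9)
Thus if
$$\phi = \phi(\theta) = -\beta'(\theta)/\alpha'(\theta),$$ (10)
then
$$\mathcal{E}(\phi(\hat\theta)) = \phi(\theta).$$ (11)

Expressing $p(x; \theta)$ as $p(x; \phi)$ by means of (11) we have for the maximum-likelihood estimator
$$\mathcal{E}(\hat\phi) = \phi.$$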


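Further, the form of $p(x; \phi)$ is such that the equality conditions for the Cramér-Rao inequality hold and so
$$\operatorname{Var}\hat\phi = \left[-n\,\mathcal{E}\left(\frac{\partial^2}{\partial\phi^2}\log p(x; \phi)\right)\right]^{-1} = [nA'(\phi)]^{-1},$$ (12)
where $A(\phi)$ is the function $\alpha$ expressed as a function of $\phi$ by (11). Thus if
$$p(x; \theta) = \frac{e^{-x} x^{\theta}}{\Gamma(\theta + 1)} \qquad (x \ge 0),$$ (13)
$\phi(\theta)$ is the digamma function of $\theta$,
$$F(\theta) = \frac{d \log \Gamma(\theta + 1)}{d\theta},$$ (14)
the maximum-likelihood estimator is
$$\hat\phi = \frac1n \log(x_1 x_2 \cdots x_n)$$ (15)
and
$$\operatorname{Var}\hat\phi = F'(\theta)/n.$$ (16)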
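The example of equations (13) to (16) can be checked numerically. The sketch below is an illustration under our own choices: it integrates over $u = \log x$ by the trapezoidal rule to obtain the mean and variance of $\log x$ under the law (13), and compares them with $F(\theta)$ and $F'(\theta)$ obtained by differencing `math.lgamma`. The function names, integration limits and step sizes are hypothetical, and the limits suit moderate $\theta$ only.

```python
import math


def log_x_moments(theta, lower=-40.0, upper=6.0, steps=20000):
    """Total mass, mean and variance of u = log x when x has density e^-x x^theta / Gamma(theta+1)."""
    h = (upper - lower) / steps
    log_norm = math.lgamma(theta + 1.0)
    s0 = s1 = s2 = 0.0
    for k in range(steps + 1):
        u = lower + k * h
        # density of u: exp((theta + 1) u - e^u) / Gamma(theta + 1)
        weight = math.exp((theta + 1.0) * u - math.exp(u) - log_norm)
        if k == 0 or k == steps:
            weight *= 0.5  # trapezoidal end weights
        s0 += weight
        s1 += weight * u
        s2 += weight * u * u
    mass = s0 * h
    mean = s1 * h / mass
    variance = s2 * h / mass - mean * mean
    return mass, mean, variance


def digamma_F(theta, step=1e-5):
    """F(theta) = d log Gamma(theta + 1) / d theta by a central difference."""
    return (math.lgamma(theta + 1.0 + step) - math.lgamma(theta + 1.0 - step)) / (2.0 * step)


def trigamma_F(theta, step=1e-3):
    """F'(theta) by a second central difference."""
    centre = math.lgamma(theta + 1.0)
    return (math.lgamma(theta + 1.0 + step) - 2.0 * centre
            + math.lgamma(theta + 1.0 - step)) / (step * step)


if __name__ == "__main__":
    theta = 2.0
    mass, mean, variance = log_x_moments(theta)
    print("mass", mass)
    print("E(log x)", mean, "F(theta)", digamma_F(theta))
    print("Var(log x)", variance, "F'(theta)", trigamma_F(theta))
```

Since $\hat\phi$ is the mean of $n$ independent values of $\log x$, agreement of the single-observation mean and variance with $F(\theta)$ and $F'(\theta)$ gives $\mathcal{E}(\hat\phi) = \phi$ and (16) directly.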


It follows from the well-known theorem on sufficient estimators that not only the variance but also the characteristic function of $\hat\phi$ is known. Thus if $\psi(t)$ is the c.f. of $\hat\phi$,
Ds E y(t) = isl 4 + J (17)
For instance, the shape constants for $\hat\phi$ are
n=- RAF Ta, apt ui
ne (19)
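3. Koopman and Pitman showed that the most general case of a law depending on $p$ unknown parameters $(\theta_1, \ldots, \theta_p)$ admitting sufficient statistics had the form
$$p(x; \boldsymbol\theta) = \exp\Big\{\sum_{r=1}^{p} \alpha_r(\boldsymbol\theta) f_r(x) + \beta(\boldsymbol\theta) + g(x)\Big\},$$ (20)
where $\boldsymbol\theta$ denotes $(\theta_1, \ldots, \theta_p)$. It follows as before that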
2 P 
0 = È arl) EG) +B) (s= LP) (21) 
and r=1 
oc Salted) (07 Les) - 
Wher r=1 
i a Lad BUOY ag, FO” 
anq ĝ ds a (0) = E a(b) Ps 20. ` (23) 
Sos Pas E Hence, if 
“here 4 à, = $007 SpA reta np) dib 
= [[o5,1| and B, is A with f. replacing ai, in the rth row: then (23) becomes 
“nd (29) ; fade (r= dyn P) 3 
» " gji =) sap a 
The minimum variance property follows from Cramér's (1946) generalized inequality, namely, that, in his terminology, the concentration ellipsoid of $\{\hat\phi_r\}$ lies inside that of any other $p$ estimators of $\{\phi_r\}$.

The variance-covariance matrix of the $\{\hat\phi_r\}$ is easily seen to be
$$\frac1n \left[\frac{\partial\alpha_r}{\partial\phi_s}\right]^{-1} = \frac1n \left[\frac{\partial\phi_r}{\partial\alpha_s}\right],$$ (27)


where $\partial\alpha_r/\partial\phi_s$ denotes the partial differential of $\alpha_r$ (expressed as a function of $\phi_1, \ldots, \phi_p$) with respect to $\phi_s$. The right-hand side of (27) gives a simpler form since it does not require matrix inversion.


This follows also from the generalization of (17); namely 
OV _ nn it; (28) 
at, = ig? (s (11m) 


where i is the joint c.c.rF. of (ó3 in terms of dummy variables (/,j and ġž(æ;) denotes the value of 
É- when the variables (0;) are written as functions of (2). 


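4. For example, the Pearson type III law of known origin
$$p(x; \theta_1, \theta_2) = \frac{\theta_1^{\theta_2 + 1} x^{\theta_2} e^{-\theta_1 x}}{\Gamma(\theta_2 + 1)} \qquad (0 \le x),$$
if
$$\phi_1 = (\theta_2 + 1)/\theta_1, \qquad \phi_2 = F(\theta_2) - \log\theta_1,$$
yields explicit unbiased maximum-likelihood estimators of $\phi_1, \phi_2$ with variance-covariance matrix
$$\frac1n \begin{bmatrix} (\theta_2 + 1)/\theta_1^2 & 1/\theta_1 \\ 1/\theta_1 & F'(\theta_2) \end{bmatrix}.$$

The type I law
$$p(x; \theta_1, \theta_2) = (1 - x)^{\theta_1} x^{\theta_2} / B(\theta_1 + 1, \theta_2 + 1) \qquad (0 \le x \le 1)$$
gives
$$\phi_1 = F(\theta_1) - F(\theta_1 + \theta_2 + 1), \qquad \phi_2 = F(\theta_2) - F(\theta_1 + \theta_2 + 1),$$
with the variance-covariance matrix
$$\frac1n \begin{bmatrix} F'(\theta_1) - F'(\theta_1 + \theta_2 + 1) & -F'(\theta_1 + \theta_2 + 1) \\ -F'(\theta_1 + \theta_2 + 1) & F'(\theta_2) - F'(\theta_1 + \theta_2 + 1) \end{bmatrix}.$$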


It should be noted that these results do not apply to the multi-parameter case where interest is focused on one of the $\{\theta_r\}$, say $\theta_1$, and the variables are so integrated out that we are left with the distribution of one sufficient statistic dependent solely on $\theta_1$. For instance, when the variables $(x_i)$ are the bivariate observations $(y_i, z_i)$ from the five-parameter normal surface and we derive the marginal law of $r$ (the sample correlation coefficient) which is dependent solely on $\rho$ (its population value), the method yields


é E _n-—2 p 
lacs] | n-34(1— p)" 


i.e. this relation results from the analogous equations to (8) and (9). It is essentially different from (11) though it has interesting similarity. This is a particular case of a general result to be published by B. I. Harley.


5. In conclusion, it may be said that this note gives a method for obtaining a best unbiased sufficient statistic in cases where Blackwell's method does not yield tractable results. It may only be reasonably applied when the property of unbiasedness is of more importance than the functional form of the parameter estimated.


REFERENCES 


Blackwell, D. (1947). Ann. Math. Statist. 18, 105.
Cramér, H. (1946). Skand. AktuarTidskr. 29, 85.
Hoel, P. G. (1951). Ann. Math. Statist. 22, 299.
Koopman, B. O. (1936). Trans. Amer. Math. Soc. 39, 399.
Pitman, E. J. G. (1936). Proc. Camb. Phil. Soc. 37, 567.




Further critical values for the two-means problem 


By W. H. TRICKETT, B. L. WELCH AND G. S. JAMES 


University of Leeds 


Tables have been given of one-sided 5 and 1% critical values of a quantity, $v$, used in comparing two means and more generally used in a wide class of comparisons whose accuracy depends in a certain way on two variances separately estimated (A. A. Aspin (1949), reproduced as Table 11 of Pearson & Hartley (1954)). We now present tables of two-sided 5 and 1% critical values of $v$, i.e. the one-sided 2·5 and 0·5% values.*

If $y$ is a normally distributed estimate of a population parameter $\eta$ with sampling variance $\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2$, where $\lambda_1$ and $\lambda_2$ are known positive constants, and if $s_1^2$ and $s_2^2$ are estimates of $\sigma_1^2$ and $\sigma_2^2$ distributed in the standard fashion with $\nu_1$ and $\nu_2$ degrees of freedom respectively, and if $y$, $s_1^2$ and $s_2^2$ are all independent, then we define
$$v = (y - \eta)/\sqrt{(\lambda_1 s_1^2 + \lambda_2 s_2^2)}.$$
$v$ has a distribution depending on the unknown ratio of $\sigma_1^2$ to $\sigma_2^2$, but, writing
$$c = \lambda_1 s_1^2/(\lambda_1 s_1^2 + \lambda_2 s_2^2),$$
we may seek a function $V(c; \nu_1, \nu_2, \alpha)$, depending on the ratio of the observed $s_1^2$ and $s_2^2$, with the property that
$$\Pr\{v > V(c; \nu_1, \nu_2, \alpha)\} = \alpha,$$
where $\alpha$ is a prescribed probability. Knowledge of such a function would permit us, for instance, to make statements about $\eta$ in the usual manner. (The formulation of the problem in these terms follows Welch (1947).)
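For the two-means case the quantities $v$ and $c$ defined above are easily computed from sample summaries. The sketch below is a minimal illustration assuming $\lambda_i = 1/n_i$ and $\nu_i = n_i - 1$, with the sample variances computed with divisor $n_i - 1$; the function name and the example figures are hypothetical, and no critical value is computed, since that still has to be read from the tables at the returned $c$, $\nu_1$ and $\nu_2$.

```python
import math


def welch_statistics(mean1, var1, n1, mean2, var2, n2, eta=0.0):
    """Return (v, c, nu1, nu2) for comparing two means with lambda_i = 1/n_i.

    var1 and var2 are the usual sample variances; eta is the hypothesised
    difference between the population means.
    """
    lam1, lam2 = 1.0 / n1, 1.0 / n2
    denom = lam1 * var1 + lam2 * var2
    v = (mean1 - mean2 - eta) / math.sqrt(denom)
    c = lam1 * var1 / denom
    return v, c, n1 - 1, n2 - 1


if __name__ == "__main__":
    # hypothetical summaries: 11 and 16 observations
    print(welch_statistics(12.0, 4.0, 11, 10.0, 9.0, 16))
```

The returned $v$ is then compared with the tabulated $V$ in the panel for $\nu_1$ and $\nu_2$, at the column nearest to the returned $c$.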

In the present tables, and in the previous tables calculated by Mrs Aspin, the object has been to give values of $V(c; \nu_1, \nu_2, \alpha)$ to two decimal places. The methods of computation described by Welch do not allow this to be done for all values of $\nu_1$ and $\nu_2$. In the present tables, for $\alpha = 0·025$ we have $\nu \ge 8$, and for $\alpha = 0·005$, $\nu \ge 10$. Even with this limitation, in a few instances there is doubt about the second decimal place.


Knowledge © 
the usual manner. 


We are, in f 


Tren, 
lire 1 of errors up to 1 unit in the last figure» a 
eS exe, nterpolation is necessary: direct linear interpolat 
e a in any panel which is bordered by V OF e = 0. 
For examples of the use of the tables reference should be made to Aspin (1949) or Pearson & Hartley (1954).


So 

m i s wer i 
M © of the c ; d . c and checking the present tables were carried out on the 
aset ester maio e S ic computer. Je would like to acknowledge the 
of 1 t2NCe of Dr D = y M LN Miss D. E- Pilling end Mr Jodo DODO ECs o£ the Department 
th; Organi . W. J. Crul , peak :tv, in helpin| one of us (6.5. .) to make usi 

his Sas od Structural Chemistry: Leeds University 1n ping ase of 


REFERENCES

Aspin, A. A. (1949). Tables for use in comparisons whose accuracy involves two variances, separately estimated. Biometrika, 36, 290-6.
Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge: The University Press.
Welch, B. L. (1947). The generalization of 'Student's' problem when several different population variances are involved. Biometrika, 34, 28-39.

* The new tables are arranged in precisely the same form as the earlier tables, but their descriptive heading has been slightly modified.


Upper 2½% critical values of $v = (y - \eta)/\sqrt{(\lambda_1 s_1^2 + \lambda_2 s_2^2)}$ (i.e. upper 5% critical values of $|v|$)*


-—— | oo | oa | ae 03 | 04 | 05 | 06 | 07 | 08 | 09 
(Ay sj As5$) | 
a y, 
E 8 2.20 | 2-14 2-10 | 2-14 | 2-20 | 2:25 
10 2.20 | 2-15 2-08 | 2-11 | 2.14 | 2:19 
12 2-20 | 2-15 2-07 | 2-08 | 2-11 | 214 
15 2-20 | 215 2-05 | 2-06 | 208 | 210 
| 20 2-20 | 2-15 2-04 | 2-04 | 205 | 2:07 
p 2-20 | 2.14 2-01 | 1-99 | 1-97 | 1:96 
10 8 2-14 | 2-11 2-10 | 2345 | 2-20 | 2:25 
10 2-44 | 2-11 2-08 | 2-11 | 2-14 | 2:18 
12 214 | 2-10 | 2-06 | 2-08 | 2-11 | 214 
15 2-14 | 2-10 2-05 | 2.06 | 2-08 | 210 
| 20 214 | 2-10 2-04 | 2-04 | 2.05 | 2-06 
| € 214 | 2-10 2-00 | 1-98 | 1-97 | 1:96 
12 8 2-18 | 214 | 211 | 208 | 2-07 | 2:97 | 2.19 2.45 | 2.20 | 2:25 
10 2-18 | 2-14 | 2-11 | 2-08 | 2-06 | 2.96 2-07 | 2-10 | 2-14 | 2:18 
12 2-18 | 2-14 | 2-11 2-08 | 2-06 | 2.05 2-06 | 2-08 | 2-11 2:14 
15 2-18 | 2-14 | 2-11 | 2-08 | 2-06 | 2-04 | 9.04 2-06 | 208 | 2:10 
| 20 2-18 | 2-14 | 2-11 | 2-08 | 2-05 | 2.04 2-03 | 2.03 | 2.05 2-06 
| "i 218 | 214 | 211 | 207 | 2-04 | 202 | 1-99 | 1.98 | 1.97 | 1-96 
15 8 2137 2-10 | S08 | 2-06) | 19:06. | zor | ong | sing | mem 2328 
10 213 | 210 | 208 | 2-06 | 205 | 2.05 | 2.07 | 2:10 | 214 | 2:18 
12 2:13 | 2-10 | 2-08 | 2-06 | 2-04 | 2.04 2-06 | 2-08 | 2.11 | 2:14 
15 243 | 239 | 208 | 2:05 |.2:04 | 203: zoa | ons | 2.03 | 240 
20 233 | 210 | 208 | 2-05 | 204 | 2-03 | 9.03 | 2.03 | 2.05 | 2-06 
ool 2-13 | 2-10 | 2-07 | 2-05 | 2-02 | 2-00 1-99 | 1.97 | 1-97 | 1-96 
20 8 2-09 | 2-07 | 2-05 | 2-04 | 204 | 2-06 | 210 | 215 | 9.90 | 2:25 
| 10 2-09 | 2-06 | 205 | 204 | 2-04 | 2-05 | 9.07 | 2-10 | 214 | 2-18 
| 12 2-09 | 2-06 | 2-05 | 203 | 203 | 2-04 | 9.05 | 208 | 211 | 2-14 
| 15 2-09 | 2-06 | 2-05 | 2-03 | 2-03 | 2.03 2-04 | 2.05 | 2.08 | 2-10 
20 2-09 | 2-06 | 2-05 | 2-03 | 2-02 | 2-02 | 9.99 2-03 | 2.05 | 2-06 
| eo | 2:09 | 2-06 | 2-04 | 2-02 | 2-01 | 1-99 | 1.98 1-97 | 1-96 | 1:96 
| -96 | 196 | 1-97 | 1-99 | 201 | 205 | 2-09 | ».14 | 9.99 | 2:25 
di 2 D 1-96 | 1-97 | 1-98 | 200 | 203 | 2.06 | 2.10 | 2.14 | 218 
12 1-96 | 1-96 | 1-97 | L98 | 1:99 | 2-02 | 2:94 | 9.97 | 211 | 214 
15 1-96 | 1-96 | 1-97 | 1-97 | 1:99 | 2-00 | 2.02 | 2.05 | 207 | 210 
20 1-96 | 196 | 1-96 | 1-97 | 1-98 | 1-99 | 2.01 | 9.99 | 2.04 | 2-06 
23 1-96 | 1-96 | 1-96 | 1-96 | 1-96 | 1-96 | 1-96 | 1-96 | 1.96 | 1-96 


* $y$ is normally distributed about $\eta$ with variance $\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2$ and $s_1^2$ and $s_2^2$ are independent estimates of $\sigma_1^2$ and $\sigma_2^2$, based on $\nu_1$ and $\nu_2$ degrees of freedom respectively. $\lambda_1$ and $\lambda_2$ are known constants. In the problem of comparing the means of samples taken from two normal populations, put $y = (\bar x_1 - \bar x_2)$, $\eta$ equal to the difference between the population means, $\nu_1 = (n_1 - 1)$, $\nu_2 = (n_2 - 1)$, $\lambda_1 = 1/n_1$ and $\lambda_2 = 1/n_2$, where $n_1$ and $n_2$ are the sample sizes.


Upper ½% critical values of $v = (y - \eta)/\sqrt{(\lambda_1 s_1^2 + \lambda_2 s_2^2)}$ (i.e. upper 1% critical values of $|v|$)*
| | | | | 
. Asi | X | 
| Qs st Asl) 00 | 01 | E > 0-5 0-6 | 0-7 0-8 0-9 1-0 
-——__ | I 
i I— - | | | | 
Va | 
10 Vy | 
10 | 3-17 | 3:08 .79 | 2-82 | 290 | 3:00 | 3:08 | 3-17 
12 3-17 | 3-08 73 | 2-79 | 284 | 291 | 298 | 3:05 
15 3-17 | 308 77 | 276 | 278 | 2:83 | 2-89 | 2-95 
20 317 | 308 | 76 | 2:73 | 2:74 | 2/76 | 2:80 | 2:85 
30 3:17 | 3:08 | 235 | 271 | 269 | 270 | 272 | 275 
[rs] 317 3-08 | 274 | 2:07 | 2.63 | 2-60 | 2-58 | 2:58 
| | I 
12 | | s a; Um 
P cord] s | am | aai za | 2:00) ) 39:08 | 1950 
5 | 298 | 291 284 re sc s »59.| 3.95 
15 5 | 2 2. 9-94 | 278 | 2°75 2-75 | 278 | 2:83 2.89 | 2-95 
3:05 | 2-98 | 291 | 7 BU | 272 | 2:73 | 276 | 280 | 2:85 
20 3-05 | 298 | 291 | 284 2-78 274 o ete pd phe es 
30 3:05 | 2.98 | 2-91 284 | 2-77 213 n D 2 2.59 9.58 | 2:58 
co & | odg | Bs 284 | 277 | 271 9:65 | 2:02 | 2:59 | 208 | 99 
3-05 | 298 | 4 91 | 4 
15 | | omg 277| 282 | 291 | 3:00 308 | 3-17 
10 | 295 | 2:89 | 2:83 | a s d d 2.84 | 29] | 298 | 3-05 
12 2.95 | 289 | 283 | 278 | sin 2.3 | 274 | 2/78 | 2:83 2.89 | 2:05 
15 2.95 | 289 | 283 | 278 | 274 | ogy | 271 | 273 | 278 | 2.80 | 2-85 
20 295 | 289 | 283 | 278 | 373 2-0 | 268 | 268 | 270 | 272 | 2775 
30 295 | 280 | 283 | 278 | 273 | gg; | 264 | 261 | 259 | 2:58 2-58 
eo 2.95 2:89 | 283 | ? 77 | 272 | 2 | 
20 | = ga arei] 282 | 202 |13:09 | 9:08 3-17 
10 285 | 280 | 278| 275 H^ 2 2.78 | 284 | 2-91 | 2-08 | 3-05 
12 2.85 | 2.80 | 2:76 a agl | 271 |274 | 278 2.83 | 289 | 2:95 
15 2.85 | 2:80 | 276 HC. ero 270 | 270 | 273 | 276 2.80 | 2-85 
20 285 | 280 | 276 | 273 | ooo | see | 267 | 68 | 270 | 278 2-75 
30 285 | 280 276 | 275 | ogg | 265 | 262 | 260 | 259 | 2-58 | 2-58 
co sus | 9802) 275. | oom a | j 
30 cy | E sd 2.82 | 2-91 | 3:00 | 3-08 | 3-17 
10 x8 | 232 | 270 | Sag | 0. 278] 277 | 2.84 | 291 | 2-98 | 3:08 
12 | 275 | 2-72 2-70 3.68 | 2°68 | 270 2:73 | 278 | 2:83 | 2-89 | 2:95 
15 245 | o72 | 270 | ES | ger| 208) 269 | 272 | $3 2.80 | 2-85 
20 255 | 27% | 830 | cag | n5 | $66. | $400 | 9-67 | 269 | 272 | 275 
30 2.75 | 2:72 2-69 2.06 | 264 | 262 2.60 | 2-59 | 2-58 | 2-58 | 2-58 
co 2.5 | 272 | 299 | d | | 
oes | 267 | 274 | 282 | 291 | 299 3-08 
RS 10 | 258 | 258 | 2-60 a 9-65 | 271 | 277 | 284 | 2-91 | 2-98 
12 25s | 208 | 259 Ser | 204 267|272| 277 | 288 | 2-89 | 
15 2.58 | 288 E 200 | 262 | 265 | 268 | 272 | 2-76 | 2:80 | 
20 2.58 | 2°58 Si 2.59 | 2:00 3.62 | 2-64 | 2-66 | 2-69 | 2-72 
30 2s | 259. £59 | ong | pss | 258 | 268 | 258 | 258 2-58 
| co 258 | 2:58 2 | | | 
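* $y$ is normally distributed about $\eta$ with variance $\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2$ and $s_1^2$ and $s_2^2$ are independent estimates of $\sigma_1^2$ and $\sigma_2^2$, based on $\nu_1$ and $\nu_2$ degrees of freedom respectively. $\lambda_1$ and $\lambda_2$ are known constants. In the problem of comparing the means of samples taken from two normal populations, put $y = (\bar x_1 - \bar x_2)$, $\eta$ equal to the difference between the population means, $\nu_1 = (n_1 - 1)$, $\nu_2 = (n_2 - 1)$, $\lambda_1 = 1/n_1$ and $\lambda_2 = 1/n_2$, where $n_1$ and $n_2$ are the sample sizes.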


An approximation for the symmetric, quadrivariate normal integral 


By J. A. McFADDEN 
U.S. Naval Ordnance Laboratory, Silver Spring, Maryland 


The generalized tetrachoric series for the multivariate normal integral, as given by A. C. Aitken (unpublished), Kendall (1941, 1945) and Moran (1948), is not considered useful for computation, as stated by David (1953). The purpose of this note is to sum this series approximately in one special case, namely, the case of four variables when all six correlation coefficients are equal.

Suppose that $x_1$, $x_2$, $x_3$ and $x_4$ obey a quadrivariate normal distribution and that all the off-diagonal elements of the correlation matrix are equal to $\rho$. Let $P_4(\rho)$ be the probability that all four variables are simultaneously positive. Following Moran's paper, we may write


$$P_4(\rho) = \frac{1}{16} + \frac{3}{4\pi}\sin^{-1}\rho + \frac{1}{4\pi^2}S(\rho), \tag{1}$$
where S(p) = 3P? — 4p? + 7p! — 19p5 + 38895 — 44p* +358 p* — .... el 


The series (2) appears hopeless for computation. Proceeding formally, we make the transformation
$$\rho = \sin\theta \qquad (-\sin^{-1}\tfrac{1}{3} \le \theta \le \tfrac{1}{2}\pi), \tag{3}$$
and expand $S(\rho)$ in powers of $\theta$; then
7 2 " 4 
S(p) = 39*— 49? + 691 — 1095 + 184% — 19339? .- 33998 — (4) 


; co 

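The series (4) can be summed approximately with the aid of a non-linear sequence-to-sequence transformation given by Shanks (1955). If $A_0$, $A_1$, $A_2$, $A_3$ and $A_4$ are the first five elements of a sequence, then Shanks's second-order transform $e_2$ provides a new sequence, the first element of which is
$$e_2(A_2) = \begin{vmatrix} A_0 & A_1 & A_2\\ \Delta A_0 & \Delta A_1 & \Delta A_2\\ \Delta A_1 & \Delta A_2 & \Delta A_3 \end{vmatrix} \Bigg/ \begin{vmatrix} 1 & 1 & 1\\ \Delta A_0 & \Delta A_1 & \Delta A_2\\ \Delta A_1 & \Delta A_2 & \Delta A_3 \end{vmatrix}, \tag{5}$$
where $\Delta A_n = A_{n+1} - A_n$.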
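As a rough illustration of how such a sequence transformation behaves numerically, the following Python sketch computes a first-order transform from three terms and a second-order transform from five terms. It is an assumption of this sketch that the second-order value is obtained by Wynn's epsilon recursion instead of a determinant ratio; the two routes are known to give the same number. The function names and the test sequence are hypothetical and are not taken from Shanks's paper.

```python
from fractions import Fraction


def shanks_e1(a0, a1, a2):
    """First-order Shanks transform (Aitken's delta-squared) of three terms."""
    return (a0 * a2 - a1 * a1) / (a0 + a2 - 2 * a1)


def shanks_e2(seq):
    """Second-order Shanks transform of five consecutive terms.

    Uses Wynn's epsilon recursion
        eps[k+1][n] = eps[k-1][n+1] + 1 / (eps[k][n+1] - eps[k][n]),
    whose even columns reproduce the Shanks transforms.
    """
    if len(seq) != 5:
        raise ValueError("exactly five terms are required")
    prev = [Fraction(0)] * 6                 # column eps[-1], identically zero
    curr = [Fraction(a) for a in seq]        # column eps[0], the sequence itself
    for _ in range(4):                       # build columns eps[1] .. eps[4]
        nxt = [prev[n + 1] + 1 / (curr[n + 1] - curr[n])
               for n in range(len(curr) - 1)]
        prev, curr = curr, nxt
    return curr[0]                           # eps[4][0]


# A sequence with limit 2 and two geometric transients: the second-order
# transform removes both transients.
terms = [2 + 3 * Fraction(1, 2) ** n - Fraction(-3, 10) ** n for n in range(5)]
limit_estimate = shanks_e2(terms)
```

Applied to the partial sums of a power series, the same routine plays the part of the transform used in the text; exact rational arithmetic is used here only to keep the illustration free of rounding error.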
Table 1

 1/ρ    P_4(ρ) from (6)    Ruben's value
  1     0.50121 388        0.50000 00000
  2     0.20000 802        0.20000 00000
  3     0.14973 847        0.14973 76529
  4     0.12647 941        0.12647 9249
  5     0.11301 258        0.11301 25446
  6     0.10422 406        0.10422 4047
  7     0.09803 672        0.09803 672
  8     0.09344 504        0.09344 505
  9     0.08990 265        0.08990 27
 10     0.08708 697        0.08708 71
 11     0.08479 535        0.08479 5
 12     0.08289 404        0.08289 4
Again proceeding formally, we apply $e_2$ to the first five partial sums of the series (4) and replace $S(\rho)$ by the first element of the transformed sequence. Then the result for the probability (1) is
(6) 


P,(p)= + 2 + D E) 


716 47" 4m (1-0)(1 29) 
where $\theta$ is defined by equation (3).


Since we have not justified the previous steps mathematically, we must test the result numerically. Table 1 gives a comparison of the approximation (6) with the values given by Ruben (1954) for $1/\rho = 1, 2, \ldots, 12$. [If $\rho = 0$, equation (6) is exact.] The agreement is exceptionally good except when $\rho$ is near unity. Better accuracy could be obtained by the use of seven terms of equation (4) and the application of more complicated non-linear transformations, as given by Shanks (1955).*

The approximation (6) is considerably more accurate than another one described elsewhere by the author (McFadden, 1955), which was based on an analogy with Pólya's urn scheme; yet (6) is no more difficult to compute than the other.

If the multivariate normal integral is known for four variables, the result for five follows immediately, as shown by David (1953).

Attempts to extend the present method to $n$ variables, with all correlation coefficients equal, and also to the general quadrivariate case (with correlation coefficients unequal) have been unsuccessful.


REFERENCES

DAVID, F. N. (1953). Biometrika, 40, 458.
KENDALL, M. G. (1941). Biometrika, 32, 198.
KENDALL, M. G. (1945). J. R. Statist. Soc. 108, 93.
McFADDEN, J. A. (1955). Ann. Math. Statist. 26, 478.
MORAN, P. A. P. (1948). Biometrika, 35, 203.
RUBEN, H. (1954). Biometrika, 41, 200.
SHANKS, D. (1955). J. Math. Phys. 34, 1. (This work was first reported in 1949 in a Naval Ordnance Laboratory Memorandum.)


* The author (J. A. M.) recommends the use of the iterated first-order transformation for each numerical value of $\theta$.


Weighted probits allowing for a non-zero response in the controls


By M. J. R. HEALY
Rothamsted Experimental Station


Black (1950) has shown how standard probit calculations can be carried out using tables of 'weighted probits'. These functions are defined by

    $\eta_0$, weighted minimum probit $= w(Y - P/Z)$,
    $\eta_1$, weighted maximum probit $= w(Y + Q/Z)$,

where $Y$ is the probit value corresponding to a proportion $P$, $Q = 1 - P$, $Z$ is the corresponding ordinate, and $w = Z^2/(PQ)$ is the appropriate weighting function. If $r$ subjects are observed to respond, and $s$ not to respond, the appropriate weight is $(r+s)w$ and the working probit multiplied by the weight is $r\eta_1 + s\eta_0$. Since $r$ and $s$ are usually fairly small integers these computations, which need only simple table look-ups, are more convenient than those required with the tables of Fisher & Yates (1953).

Weighted probits have certain advantages on automatic computers. In desk computation the minimum working probit is used, but this becomes numerically large for extreme values of $P$, and an extensive table is needed at this stage of the computation. This table is too large to be held in the store of present-day machines. By contrast, the weighted probit functions behave reasonably from a numerical point of view in the range of practical importance, and, since high-order interpolation is a simple process on automatic computers, tables with comparatively few entries can be used, which can be accommodated in the store of the machine without difficulty. It would be possible to avoid the use of tables by computing the functions ab initio as required, but this would complicate the necessary routine and would probably be slower than a simple table look-up. A further advantage is that the graduating function




can be changed from probits to, for example, logits or angles without any alteration in the main programme, simply by inserting the appropriate tables.

A slight extension of the simple method is required when it is necessary to allow for a non-zero response rate in untreated material. Finney (1952, pp. 88-91) has shown that when this 'natural response rate' is well determined and not too large, its effect can be allowed for by two simple additions to the ordinary process. In the first place, the observed proportions responding, $P$, are adjusted by the so-called Abbott's correction to give adjusted proportions $P' = (P - C)/(1 - C)$, where $C$ is the natural response rate; secondly, the weights have to be multiplied by a factor $P'/\{P' + C/(1 - C)\}$. It is of some interest that these alterations can be taken care of by modifying the tables of weights and weighted probits, thus enabling the original machine programme to be used unchanged.
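The two adjustments just described are simple enough to sketch in a few lines of Python. The sketch below assumes the weight factor has the form $P'/\{P' + C/(1 - C)\}$ as read from the text above; the function names are hypothetical, and the numbers in the usage lines are invented purely for illustration.

```python
def abbott_adjust(p_observed, c):
    """Abbott's correction: adjusted proportion P' = (P - C) / (1 - C)."""
    if not 0.0 <= c < 1.0:
        raise ValueError("natural response rate C must lie in [0, 1)")
    return (p_observed - c) / (1.0 - c)


def weight_factor(p_adjusted, c):
    """Multiplier applied to the ordinary weight: P' / (P' + C / (1 - C))."""
    return p_adjusted / (p_adjusted + c / (1.0 - c))


# Illustrative values only: half the treated subjects respond,
# and the natural response rate is 20 per cent.
p_dash = abbott_adjust(0.5, 0.2)      # (0.5 - 0.2) / 0.8
factor = weight_factor(p_dash, 0.2)   # reduces the weight when C > 0
```

With $C = 0$ the adjusted proportion equals the observed one and the factor is unity, so the ordinary process is recovered.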

It is somewhat simpler to work in terms of Normal Equivalent Deviates rather than probits, the addition of 5 being in any case irrelevant when working on automatic computers. The necessary alterations to the weights, $w$, are taken care of by inserting a suitably modified table. The working probit is given by
$$y = Q'y_0 + P'y_1,$$
where $P'$ is the adjusted proportion responding, $Q' = 1 - P'$ and $y_0$, $y_1$ are the minimum and maximum working probits. In terms of the observed proportion responding, this gives
$$y = \frac{Q}{1-C}\,y_0 + \frac{P-C}{1-C}\,y_1 = Q\,\frac{y_0 - Cy_1}{1-C} + P\,y_1,$$
so that the required value, $nwy$, is given in terms of the weighted probits by
$$nwy = s\,\frac{\eta_0 - C\eta_1}{1-C} + r\,\eta_1.$$
Thus the form of the computation remains unaltered if we use the adjusted weighted probits
$$\eta_0' = (\eta_0 - C\eta_1)/(1 - C)$$
and $\eta_1$, where the modified weights are used in computing $\eta_0$ and $\eta_1$ as functions of $Y$.

Tables of $w$, $\eta_0'$ and $\eta_1$ have been prepared covering the range of N.E.D.'s $-4(0.125)+4$, with $C = 0(1)9$ %. Four-point interpolation in these tables (a convenient method is that described by Fisher & Yates (1953, p. 33)) gives 5-decimal accuracy over almost the whole range. Similar tables for the logit and angle function are in preparation.


REFERENCES

BLACK, A. N. (1950). Weighted probits and their use. Biometrika, 37, 158-67.
FINNEY, D. J. (1952). Probit Analysis, 2nd ed. Cambridge University Press.
FISHER, R. A. & YATES, F. (1953). Statistical Tables for Biological, Agricultural and Medical Research, 4th ed. Edinburgh: Oliver and Boyd.


Treatment variances for experimental designs with serially correlated observations


By J. C. BUTCHER
Applied Mathematics Laboratory, New Zealand Department of Scientific and Industrial Research


1. INTRODUCTION AND SUMMARY

Williams (1952) has considered the design of field experiments in which the fertilities of neighbouring plots are assumed to follow a linear autoregressive scheme of order one or two. Here a formal generalization is made and a notation introduced which simplifies the calculation of variances of estimates.

A model of the following form is assumed:
$$x_t + a_1x_{t-1} + a_2x_{t-2} + \ldots + a_px_{t-p} = \epsilon_t, \tag{1}$$
where the $\epsilon_t$ are independent and distributed $N(0, \sigma^2)$. For convenience this will be written as
$$\sum_{i=0}^{p} a_ix_{t-i} = \epsilon_t, \tag{2}$$
it being understood that $a_0$ is to be replaced by 1 when written in full.

The yield $y_t$ is assumed to be equal to the sum of $x_t$ and a term depending on the treatment, i.e.
$$y_t = x_t + b_{(t)}, \tag{3}$$
where the symbol $(t)$ denotes the number of the treatment used on the $t$th plot. There are $c$ treatments.

The estimation equations are derived, and it is found that the information matrix factorizes in a manner which simplifies the calculation of the variance of the difference of the estimates of treatment constants. Some examples are given.


2. NOTATION

A square matrix of order $c$ all of whose elements are unity will be denoted by $E$. Consider matrices of the form $\alpha I + \beta E$. As $E^2 = cE$,
$$(\alpha_1I + \beta_1E)(\alpha_2I + \beta_2E) = \alpha_1\alpha_2I + (\alpha_1\beta_2 + \alpha_2\beta_1 + c\beta_1\beta_2)E. \tag{4}$$
If $\alpha_2I + \beta_2E$ is the inverse of $\alpha_1I + \beta_1E$, then
$$\alpha_2 = \frac{1}{\alpha_1} \quad\text{and}\quad \beta_2 = \frac{-\beta_1}{\alpha_1(\alpha_1 + c\beta_1)}. \tag{5}$$
We are interested in the cases where the information matrix is $\alpha_1I + \beta_1E$ and the variance-covariance matrix is $\alpha_2I + \beta_2E$. The variances will all equal $\alpha_2 + \beta_2$ and the covariances will all equal $\beta_2$. Thus the variances of differences will equal
$$2\alpha_2 = 2/\alpha_1. \tag{6}$$
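The inverse relation for matrices of the form $\alpha I + \beta E$ is easy to check by direct multiplication. The Python sketch below does so in exact rational arithmetic for one arbitrary choice of $\alpha_1$, $\beta_1$ and $c$; the helper names and the numerical values are hypothetical and serve only as a check on the algebra.

```python
from fractions import Fraction


def alpha_i_plus_beta_e(alpha, beta, c):
    """Return the c x c matrix alpha*I + beta*E, E being the all-ones matrix."""
    return [[beta + (alpha if i == j else 0) for j in range(c)] for i in range(c)]


def matmul(x, y):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_coefficients(alpha1, beta1, c):
    """Coefficients (alpha2, beta2) of the inverse of alpha1*I + beta1*E."""
    alpha2 = 1 / alpha1
    beta2 = -beta1 / (alpha1 * (alpha1 + c * beta1))
    return alpha2, beta2


c = 4
alpha1, beta1 = Fraction(2), Fraction(3)        # arbitrary illustrative values
alpha2, beta2 = inverse_coefficients(alpha1, beta1, c)
product = matmul(alpha_i_plus_beta_e(alpha1, beta1, c),
                 alpha_i_plus_beta_e(alpha2, beta2, c))
identity = [[Fraction(int(i == j)) for j in range(c)] for i in range(c)]
# Variance of a difference: two variances minus twice the covariance.
var_difference = 2 * (alpha2 + beta2) - 2 * beta2
```

The last line shows why only $\alpha_1$ matters for treatment differences: the $\beta$ terms cancel.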


The symbol $\delta_{ij}$ is defined for the $i$th plot and the $j$th treatment as unity if that plot receives this treatment and zero otherwise. As each plot receives some treatment we have
$$\sum_j \delta_{ij} = 1. \tag{7}$$
Also we have such results as
$$\sum_{(i)=j} a_i = \sum_i a_i\delta_{ij} \tag{8}$$
and
$$\sum_i \delta_{ij} = \text{number of plots receiving the } j\text{th treatment}. \tag{9}$$


3. ESTIMATION OF PARAMETERS

The frequency function of the $\epsilon$'s is
$$(2\pi\sigma^2)^{-\frac{1}{2}n}\exp\left\{-\frac{1}{2\sigma^2}\sum_{t=1}^{n}\epsilon_t^2\right\}. \tag{10}$$
Since the $x$'s and $\epsilon$'s are connected by the linear relation (1), the likelihood of the sample $(x_t;\ t = 1, \ldots, n)$ can be directly calculated from (10). The Jacobian of the transformation is unity. In order to avoid end complications, we shall regard the values of $x_0, x_{-1}, \ldots, x_{-p+1}$ as parameters. Estimation of these by maximum likelihood immediately gives $\epsilon_1 = \epsilon_2 = \ldots = \epsilon_p = 0$. To see the effect of this, consider instead the $\epsilon_t$ $(t = 1, 2, \ldots, p)$, which constitute independent linear combinations of these $x$'s and of the other parameters. We see that the estimates of these $\epsilon$'s are independent of the estimates of the other parameters. We thus maximize the expression for the conditional likelihood
$$L = -\frac{1}{2\sigma^2}\sum_{t=p+1}^{n}\epsilon_t^2 - \tfrac{1}{2}(n-p)\log\sigma^2 \tag{11}$$
to estimate the parameters $a_i$, $b_j$. $\sigma^2$ is, of course, best estimated by dividing the residual sum of squares by $n - 2p - c$.


The estimates of the remaining parameters $\theta$ are the solutions of
$$\sum_{t=p+1}^{n}\epsilon_t\frac{\partial\epsilon_t}{\partial\theta} = 0. \tag{12}$$
This gives for $\theta = a_i$
$$\sum_{t=p+1}^{n} x_{t-i}\,\epsilon_t = 0, \tag{13}$$
and for $\theta = b_j$
$$\sum_{t=p+1}^{n}\Bigl(\sum_{(t-i)=j} a_i\Bigr)\epsilon_t = 0. \tag{14}$$
By (8) this can be written
$$\sum_{t=p+1}^{n}\sum_{i=0}^{p}(a_i\,\delta_{t-i,j}\,\epsilon_t) = 0. \tag{15}$$

These equations can be solved by successive approximation, taking the first approximation for the $b_j$ as the mean yield for the plots receiving the $j$th treatment. Using these values the $a_i$ can be approximated and their values used to improve the approximation of the $b_j$, and so forth.


4. VARIANCES AND COVARIANCES OF ESTIMATES OF TREATMENT CONSTANTS 


The variance-covariance matrix of parameter estimates is the inverse of the information matrix, which is approximately given by
$$\left(-\frac{\partial^2L}{\partial\theta_i\,\partial\theta_j}\right). \tag{16}$$
We observe that if $\theta_i = b_k$ and $\theta_j = a_l$ or $\sigma^2$, then
$$\mathscr{E}\left(\frac{\partial^2L}{\partial\theta_i\,\partial\theta_j}\right) = 0. \tag{17}$$
We may thus consider merely the matrix
$$\left(-\frac{\partial^2L}{\partial b_i\,\partial b_j}\right) \tag{18}$$
as the information matrix for the $b$'s.

Let $V$ be the variance-covariance matrix for the $b$'s. We have
$$V^{-1} = \frac{1}{\sigma^2}\left(\sum_{t=p+1}^{n}\ \sum_{(t-k)=i} a_k \sum_{(t-l)=j} a_l\right). \tag{19}$$
It is easily verified that
$$V^{-1} = (\Delta\Lambda)(\Delta\Lambda)'/\sigma^2, \tag{20}$$
where
$$\Delta = (\delta_{ji}) \quad (i = 1, 2, \ldots, c;\ j = 1, 2, \ldots, n), \tag{21}$$
$$\Lambda = (a_{p+j-i}) \quad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, n-p), \tag{22}$$
where $a_k$ is set equal to zero if $k < 0$ or $k > p$. That is, the information matrix can be expressed in terms of $\Delta$, which depends only on the design (and which characterizes the design), and of $\Lambda$, which depends only on the autoregressive coefficients.

It seems reasonable to require that the design be symmetric with respect to all treatments, which leads to the requirement that $V^{-1}$ be of the form $\alpha I + \beta E$. One must place certain restrictions on $\Delta$. It is easily seen that the value of $n - p$ must equal $mc$ ($m$ being an integer, the number of times each treatment occurs). As the last $p$ treatments must be the same as the first $p$, we need only consider the $mc$ treatments. In all that follows it is assumed that an additional $p$ plots are used. It is clear that a repetition of a suitable design of $mc$ plots any number of times to make a design of length $N$ is also suitable.

Conditions for suitable designs are easily seen. Let $u_{ij}^{(r)}$ be the number of times the $i$th and the $j$th treatment occur $r$ plots apart. Then the condition is that
$$(u_{ij}^{(r)}) = \alpha_rI + \beta_rE$$
for all $r$ from 1 to $p$.
Suitable designs are given by Williams (1952) for the cases: 


p-l «20; p=1, a+/,=0; p=2, a + f= af 0 
which he calls respectively types (IIb), (IIa) and (III).




It is clear that when designs suitable for $p = 1$ are found we can generalize them to any $p$ by repeating each treatment over $p$ adjacent plots, e.g.
    1212...
would become
    11...1, 22...2, 11...1, 22...2, ....

As designs for first-order autoregression where $\alpha_1 + \beta_1 = 0$ are easily constructed by writing a suitable succession of permutations of the $c$ treatments, we thus have a method of finding designs for any number of treatments.

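As an example of the calculation of these matrices consider the case $c = 3$, $p = 2$, $n = 3m + 2$:
$$\Delta \text{ becomes } \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 0 & \ldots\\ 0 & 1 & 0 & 0 & 1 & 0 & \ldots\\ 0 & 0 & 1 & 0 & 0 & 1 & \ldots \end{pmatrix}, \qquad \Lambda \text{ becomes } \begin{pmatrix} a_2 & & & \\ a_1 & a_2 & & \\ 1 & a_1 & a_2 & \\ & 1 & a_1 & \ldots\\ & & 1 & \ldots\\ & & & \ldots \end{pmatrix}.$$
Thus
$$V^{-1} = \frac{m}{\sigma^2}\{(1 + a_1^2 + a_2^2)I + (a_1 + a_2 + a_1a_2)(E - I)\}$$
$$\qquad = \frac{m}{\sigma^2}\{(1 - a_1 - a_2 + a_1^2 - a_1a_2 + a_2^2)I + (a_1 + a_2 + a_1a_2)E\}. \tag{23}$$
Therefore the variance of the estimates of treatment differences, denoted by $v$, equals, by (6),
$$v = \frac{2\sigma^2}{m[1 - a_1 - a_2 + a_1^2 - a_1a_2 + a_2^2]}. \tag{24}$$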
In the following examples the number of plots is $N + p$. $a_1 = -\rho$ is written when $p = 1$.

Case: $p = 1$, $c = 2$; $a_1 = -\rho$.

    Design              v = var(b_1 − b_2)
    [121212...]         $4\sigma^2/\{N(1+\rho)^2\}$        (25)
    [11221122...]       $4\sigma^2/\{N(1+\rho^2)\}$        (26)

The relative efficiency of (25) and (26) is $1 + 2\rho/(1 + \rho^2)$, which is greater than one if $\rho$ is positive.
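Variances of this kind can be checked numerically from the information matrix. The Python sketch below is a minimal illustration under stated assumptions: the design is taken to repeat a short cycle of treatments indefinitely, so that end effects are ignored by wrapping the cycle round on itself, and the quantity returned is $Nv/\sigma^2$. The function name and its arguments are hypothetical.

```python
from fractions import Fraction


def scaled_difference_variance(cycle, a):
    """Return N * var(b_i - b_j) / sigma^2 for a design repeating `cycle`.

    cycle : list of treatment labels making up one period of the design
    a     : autoregressive coefficients [a_1, ..., a_p] (a_0 = 1 is implied)
    Raises ValueError if the information matrix is not alpha*I + beta*E.
    """
    coeffs = [Fraction(1)] + [Fraction(x) for x in a]
    labels = sorted(set(cycle))
    index = {label: k for k, label in enumerate(labels)}
    c, length = len(labels), len(cycle)
    info = [[Fraction(0)] * c for _ in range(c)]
    for t in range(length):
        # Coefficient of each treatment constant in the residual for plot t.
        row = [Fraction(0)] * c
        for i, a_i in enumerate(coeffs):
            row[index[cycle[(t - i) % length]]] += a_i
        for j in range(c):
            for k in range(c):
                info[j][k] += row[j] * row[k]
    diagonal = {info[j][j] for j in range(c)}
    off_diagonal = {info[j][k] for j in range(c) for k in range(c) if j != k}
    if len(diagonal) != 1 or len(off_diagonal) != 1:
        raise ValueError("information matrix is not of the form alpha*I + beta*E")
    alpha = diagonal.pop() - off_diagonal.pop()   # coefficient of I per cycle
    return 2 * length / alpha


rho = Fraction(1, 2)
alternating = scaled_difference_variance([1, 2], [-rho])
paired = scaled_difference_variance([1, 1, 2, 2], [-rho])
```

For $\rho = \tfrac{1}{2}$ the two values are $16/9$ and $16/5$, so the alternating design gives the smaller variance, in line with the remark on relative efficiency.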


Case: $p = 2$, $c = 2$.

    Design              v = var(b_1 − b_2)
    [121212...]         $4\sigma^2/\{N(1 - a_1 + a_2)^2\}$               (27)
    [11221122...]       $4\sigma^2/\{N[1 + a_1^2 - 2a_2 + a_2^2]\}$      (28)

Consider now a more general case:
11..1 292...2 Uel = Nic 
i — E Ta 2 i 
hey, n "n a 
e 
PR. n... Sg = Ze aja iN: We have + 
i e (29) 
Aux 
MM MUI dl TE 
t BEIE a) -? E Euh 
; "m ; i 
1 i=0 : f ssible; otherwise 
the Sitzy,, s : ake the runs of 1's and 2's as long aS po 
Vs apa | t= | » 0 it is advantageous ou 4-2 
"T 


n > 
drs Should all equal p. 




Case: $c = 3$, $p = 2$, $n = N + 2$ (for case $p = 1$, $n = N + 1$ substitute $a_2 = 0$, $a_1 = -\rho$):

    Design                  v = var(b_i − b_j)
    [123, 123 ...]          $6\sigma^2/\{N[1 - a_1 - a_2 + a_1^2 - a_1a_2 + a_2^2]\}$        (30)
    [123, 312, 231 ...]     $6\sigma^2/\{N[1 - a_2 + a_1^2 + a_2^2]\}$                      (31)
    [123, 231, 312 ...]     $6\sigma^2/\{N[1 - a_1 + a_2 + a_1^2 - a_1a_2 + a_2^2]\}$        (32)
    [11, 22, 33 ...]        $12\sigma^2/\{N[2 + a_1 + 2a_1^2 + a_1a_2 - 2a_2 + 2a_2^2]\}$    (33)

The last is a special case of the following: 


Wind :298...9. 9383...3.. 
SF foe I i 
p p p 
_ 6pc? (34) 
^ RUSSE [| 
x[» (£«) -,2A X se iil] 
0 2i-9j-0 


Cases (24), (25), (26) and (30) are included in formulae given by Williams. 


In conclusion I wish to express my thanks to Dr P. Whittle who aroused my interest in this subject.


REFERENCE

WILLIAMS, R. M. (1952). Experimental designs for serially correlated observations. Biometrika, 39, 151.


The multivariate distribution of complex normal variables 


By R. A. WOODING 
Department of Scientific and Industrial Research, Wellington, New Zealand


INTRODUCTION 


The expression for the distribution of several normal real variables with correlation was introduced for the first time by Bravais (1846). Since then it has been discussed by many writers, notably Pearson (1896) and Cramér (1937, Chapter X).

Its applications include the case when the components of the multivariate distribution are real stochastic time variables which obey the normal law, as in random noise (Rice, 1944, 1945).

However, for certain problems connected with the envelope of a random noise signal, it is advantageous to consider a complex form, of which the signal is a component (Bunimovitch, 1949). A logical step is to derive the equivalent distribution of an ensemble of complex variates for which the real and imaginary components are normally distributed. It is found that, provided the real and imaginary components $x_m$, $y_n$ obey the covariance relations
$$E(x_mx_n) = E(y_my_n), \qquad E(x_my_n) = -E(y_mx_n),$$
the distribution of $N$ complex variates $v_n$ has the simple form
$$\pi^{-N}\,|L|^{-1}\exp(-V'^{*}L^{-1}V),$$
where $L$ is the Hermitian covariance matrix.

Since this does not appear to have been published before, a derivation based on the real variable expression is given here.




DEFINITION OF COMPLEX VARIATES 


Consider $N$ pairs of real variables $x_n(t)$, $y_n(t)$ $(1 \le n \le N)$, where $t$ represents a sampling parameter. Let the complex variable
$$v_n(t) = x_n(t) + iy_n(t) \tag{1}$$
be such that it can be written in the form


v,(t) = E (any— ns) exp {i0;(t)} (2) 
j 


in which the $a_{nj}$, $b_{nj}$ are real coefficients. This complex Fourier series arises in numerous fields, particularly in theory related to time series.

From (1) and (2) we see that
xlt) = E {ans COS 0j) + bassin G0), 
3 


y,(t) = X(-«s; sin O,{t) + bns cos 040) 
j 


i.e. the $x_n$, $y_n$ are in 'phase quadrature'. This property is expressed compactly by the reciprocal Hilbert transformations (Bunimovitch, 1949, p. 1231)


"A Z adt- 9) (3) 
TERI Z yen (52) 
mJ- 


Wh . 
di f signifies the Cauchy principal value of the integral. 


COVARIANCE RELATIONS

Using (3a), (3b), it is readily shown that the $x_n$, $y_n$ satisfy the following covariance relations, which are fundamental:
$$E(y_my_n) = E(x_mx_n), \tag{4a}$$
$$E(x_my_n) = -E(y_mx_n). \tag{4b}$$
If these relations, together with the standardizing relations $E(x_n) = E(y_n) = 0$, are taken as the initial postulate, our development of the theory of the complex variates will not involve Fourier components.

Now we consider the complex column vector
$$V = X + iY. \tag{5}$$
From the $v_n$ in (1), the composite column vector $\{XY\}$ will have a covariance matrix of form
$$\Lambda = \begin{pmatrix} A & B\\ B' & A \end{pmatrix}, \tag{6}$$
where the submatrix $A$ is symmetric and $B$ is skew-symmetric. It is easily verified that a decomposition of the corresponding inverse matrix is
F 4 —B[^4 als 7 
3 A-B) )( , Y $ a 
5 4) * ai apanb NE AN, 


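which is of the same form as (6), so we can write
$$\Lambda^{-1} = \begin{pmatrix} A & B\\ B' & A \end{pmatrix}^{-1} = \begin{pmatrix} P & Q\\ Q' & P \end{pmatrix}, \tag{8}$$
with
$$P = (A + BA^{-1}B)^{-1}, \tag{9a}$$
$$Q = -(A + BA^{-1}B)^{-1}BA^{-1}, \tag{9b}$$
and
$$P = P', \qquad Q = -Q'. \tag{9c}$$
Then it follows from formula (9) that
$$P - iQ = (A - iB)^{-1}. \tag{10}$$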


NORMAL VARIATES 
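When the real vectors $X$, $Y$ are normal, the likelihood function of the composite vector $\{XY\}$ is
$$(2\pi)^{-N}\,|\Lambda|^{-\frac{1}{2}}\exp[-\tfrac{1}{2}\{X'Y'\}\Lambda^{-1}\{XY\}], \tag{11}$$
for which the exponent may be written
$$-\tfrac{1}{2}\{X'Y'\}\Lambda^{-1}\{XY\} = -\tfrac{1}{2}(X' - iY')(P - iQ)(X + iY), \tag{12}$$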
since Q is skew-symmetric, while the decomposition 
( sl r (E py [E JU T XL ) 
Q' P, "E -Q XXe Bug 
shows that
$$|\Lambda|^{-1} = |P - iQ|\,|P + iQ| = |P - iQ|^2 = |A - iB|^{-2}. \tag{13}$$
Combining the expressions (10)-(13), we now obtain the result
$$(2\pi)^{-N}\,|A - iB|^{-1}\exp[-\tfrac{1}{2}(X' - iY')(A - iB)^{-1}(X + iY)] \tag{14}$$
for the likelihood of the complex normal vector $X + iY$. But since the covariance matrix of the complex vector $V = X + iY$ is the Hermitian matrix
$$L = E(VV'^{*}) = 2(A - iB), \tag{15}$$
it follows that
$$|L| = 2^N\,|A - iB|, \tag{16}$$
and that (14) can be expressed in the alternative form
$$\pi^{-N}\,|L|^{-1}\exp(-V'^{*}L^{-1}V) \tag{17}$$
for the normal probability function of the $N$ complex numbers $v_n$.

It will be noted that this simple result is true only if the relations (4a) and (4b) hold.
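The determinant relation underlying this result, namely that the $2N \times 2N$ real covariance matrix built from a symmetric $A$ and a skew-symmetric $B$ has determinant equal to the square of the determinant of $A - iB$, can be checked numerically. The Python sketch below does this for one small example; the helper names and the numerical matrices are hypothetical, chosen only so that $A$ is symmetric and $B$ is skew-symmetric.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (complex-safe)."""
    m = [[complex(x) for x in row] for row in matrix]
    n = len(m)
    result = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) == 0:
            return 0j
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result            # a row swap changes the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return result


def composite_covariance(a, b):
    """Real 2N x 2N matrix with blocks [[A, B], [B', A]]."""
    n = len(a)
    top = [list(a[i]) + list(b[i]) for i in range(n)]
    bottom = [[b[j][i] for j in range(n)] + list(a[i]) for i in range(n)]
    return top + bottom


def a_minus_ib(a, b):
    """Complex N x N matrix A - iB."""
    n = len(a)
    return [[a[i][j] - 1j * b[i][j] for j in range(n)] for i in range(n)]


A = [[2.0, 0.5], [0.5, 3.0]]      # symmetric
B = [[0.0, 1.0], [-1.0, 0.0]]     # skew-symmetric
det_real = det(composite_covariance(A, B))
det_complex = det(a_minus_ib(A, B))
```

Because $A - iB$ is Hermitian its determinant is real, so the square and the squared modulus coincide in this check.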
THE CHARACTERISTIC FUNCTION

For the density function (11), the characteristic function can be written
$$(2\pi)^{-N}\,|\Lambda|^{-\frac{1}{2}}\int_X\!\int_Y \exp[\,i\{R'S'\}\{XY\} - \tfrac{1}{2}\{X'Y'\}\Lambda^{-1}\{XY\}]\,dX\,dY, \tag{18}$$
if $dX\,dY = \prod_{n=1}^{N} dx_n\,dy_n$ and $\{RS\}$ is a real column vector. This becomes
$$\exp[-\tfrac{1}{2}\{R'S'\}\Lambda\{RS\}] = \exp[-\tfrac{1}{2}(R' - iS')(A - iB)(R + iS)], \tag{19}$$
since $B$ is skew-symmetric. Hence, in the complex-variate notation, (18) and (19) give
$$\pi^{-N}\,|L|^{-1}\int_V \exp[\,i\,\mathrm{Rl}(T'^{*}V) - V'^{*}L^{-1}V]\,dV = \exp(-\tfrac{1}{4}T'^{*}LT), \tag{20}$$
where $dV = dX\,dY$, $T = R + iS$, and Rl signifies that the real part is taken.


It is evident from the above that the theory of the distribution of complex variates is most readily developed from real variate theory.

The author wishes to acknowledge valuable advice and suggestions received at the Applied Mathematics Laboratory, N.Z.D.S.I.R., Wellington, and to thank the N.Z.D.S.I.R. for permission to publish this short note.


REFERENCES

BRAVAIS, A. (1846). Analyse mathématique sur les probabilités des erreurs de situation d'un point. Mém. Acad. Sci., Paris, 9.
BUNIMOVITCH, V. I. (1949). J. Tech. Phys. (U.S.S.R.), 19. Cited (p. 1361) by Magness, T. A. (1954). Response of a quadratic device to non-Gaussian noise. J. Appl. Phys. 25.
CRAMÉR, H. (1937). Random Variables and Probability Distributions, Camb. Math. Tract, no. 36.
PEARSON, K. (1896). Mathematical contributions to the theory of evolution. III. Phil. Trans. A, 187, 253-318.
RICE, S. O. (1944). Mathematical analysis of random noise. Bell Syst. Tech. J. 23, 282-332.
RICE, S. O. (1945). (Concluded). Bell Syst. Tech. J. 24, 46-156.


Stationarity conditions for stochastic processes of the autoregressive and moving-average type


By J. WISE
University of Birmingham


INTRODUCTION AND SUMMARY*

The criterion for the stationarity of a process of the autoregressive and moving-average types has been given by Wold (1954) in the form that the roots of a specified equation shall lie inside the unit circle. A generalization of this result has been given by Doob (1953), who showed that a stationary process of the autoregressive or moving-average type has a rational spectral density function $g(z)$, where
$$g(z) = \frac{A(z)A(z^{-1})}{B(z)B(z^{-1})}, \tag{1}$$
$A(z)$ and $B(z)$ being polynomials in $z$ with no common factors, the roots of the equation
$$B(z) = 0 \tag{2}$$
all lying inside the unit circle, and no root of the equation
$$A(z) = 0 \tag{3}$$
lying outside the unit circle.

In this paper these results due to Wold and Doob are transformed into a set of criteria which are obtained directly from the coefficients of the equations concerned. No intermediate evaluation of the roots of the equation is necessary. Necessary and sufficient conditions for the stationarity of a stochastic process of the autoregressive or moving-average type are thus expressed explicitly in terms of the coefficients of the process. These criteria are expressed as determinants, permitting systematic treatment in the general case. All the elements of the determinants for polynomials up to (and including) the fourth degree are given, and a set of necessary (but not sufficient) conditions are given which are of a very simple algebraic form.


DERIVATION OF THE CRITERIA

Consider the autoregressive moving-average process
$$x_t + \alpha_1x_{t-1} + \ldots + \alpha_kx_{t-k} = \epsilon_t + \beta_1\epsilon_{t-1} + \ldots + \beta_h\epsilon_{t-h}. \tag{4}$$

* This paper is based on material contained in Wise (1955a).


We may assume, without loss of generality, that the polynomials
$$z^k + \alpha_1z^{k-1} + \ldots + \alpha_k \quad\text{and}\quad z^h + \beta_1z^{h-1} + \ldots + \beta_h \tag{5}$$
have no common factor.

Then, if the process (4) is stationary, all the roots of the equation
$$z^k + \alpha_1z^{k-1} + \ldots + \alpha_k = 0 \tag{6}$$
lie inside the unit circle, and none of the roots of the equation
$$z^h + \beta_1z^{h-1} + \ldots + \beta_h = 0 \tag{7}$$
lies outside the unit circle if the process is non-circular, while all the roots lie inside the unit circle if the process is circular.* Let us consider the polynomial equation
$$x^k + \alpha_1x^{k-1} + \ldots + \alpha_k = 0. \tag{8}$$
The roots of this equation are given by the reduction of the polynomial to real or complex linear factors such that
$$(x - x_1)(x - x_2)\ldots(x - x_k) = 0. \tag{9}$$
We now require the necessary and sufficient conditions for the satisfaction of the inequalities
$$|x_j| < 1 \quad (j = 1, 2, 3, \ldots, k). \tag{10}$$
Put
$$y = \frac{x+1}{x-1} \quad\text{and}\quad x = a + ib \tag{11}$$
(where $a$ and $b$ are real). We find
$$y = \frac{a^2 + b^2 - 1}{(a-1)^2 + b^2} - \frac{2ib}{(a-1)^2 + b^2} \tag{12}$$
$$\quad = R(y) + iC(y), \tag{13}$$
where $R(y)$ denotes the real part and $iC(y)$ denotes the complex part of $y$. We have
$$R(y) = \frac{a^2 + b^2 - 1}{(a-1)^2 + b^2}, \tag{14}$$
$$C(y) = \frac{-2b}{(a-1)^2 + b^2}, \tag{15}$$
$$|x| = (a^2 + b^2)^{\frac{1}{2}}, \tag{16}$$
$$(a-1)^2 + b^2 \ge 0. \tag{17}$$
Thus from (14), (16) and (17) it follows that it is necessary and sufficient for $|x| < 1$ that
$$R(y) < 0. \tag{18}$$
From (11) we obtain
$$x = \frac{y+1}{y-1}, \tag{19}$$
so that (10) is satisfied, if, and only if,
$$R(y_j) < 0 \quad (j = 1, 2, \ldots, k), \tag{20}$$
where $y_j$ $(j = 1, 2, \ldots, k)$ are the roots of the equation
$$(y+1)^k + \alpha_1(y+1)^{k-1}(y-1) + \ldots + \alpha_k(y-1)^k = 0, \tag{21}$$
which is obtained by substituting for $x$ from (19) into (8). Expressing (21) in ascending powers of $y$, we obtain
$$p_0 + p_1y + p_2y^2 + \ldots + p_ky^k = 0, \tag{22}$$
where
$$p_r = \sum_{j=0}^{k}\alpha_jc_{rj}, \tag{23}$$
$\alpha_0 = 1$ and $c_{rj}$ is the coefficient of $y^r$ in the expression $(y+1)^{k-j}(y-1)^j$. The values of the coefficients $c_{rj}$ are tabulated below for $k = 1, 2, 3, 4$.

* See Wise (1955b) for a discussion of circular and non-circular processes of the autoregressive and moving-average type.


From these the coefficients $p_1, \ldots, p_k$ are tabulated for $k = 1, 2, 3, 4$, expressed in terms of the coefficients $\alpha_1, \ldots, \alpha_k$ in (8), the coefficient $p_0$ having been set equal to unity.

The necessary and sufficient conditions for (20) to be satisfied, expressed as criteria involving only the coefficients in equation (22), have been derived by Routh (1930) and are given in convenient form below. The criteria are obtained as follows. We form the matrix $P$, where
$$P = \begin{pmatrix} p_1 & p_3 & p_5 & \ldots\\ p_0 & p_2 & p_4 & \ldots\\ 0 & p_1 & p_3 & \ldots\\ 0 & p_0 & p_2 & \ldots\\ \ldots & \ldots & \ldots & \ldots \end{pmatrix}, \tag{24}$$
$p_r$ being taken as zero for $r > k$. The Routh conditions then state that (20) is satisfied, and thus (10) is satisfied, if, and only if, the following inequalities are satisfied ($p_0$ having been put equal to unity):
$$P_j = p_j > 0 \quad (j = 1 \text{ and } j = k), \qquad P_r > 0, \tag{25}$$
where $P_r$ $(r = 1, 2, \ldots, k-1)$ is the principal minor of $P$ formed from the first $r$ rows and the first $r$ columns. The principal minor formed from the first $k$ rows and columns equals $p_kP_{k-1}$, and since both this and $P_{k-1}$ must be positive, this implies that $p_k$ is positive. A set of necessary, but not sufficient, conditions for (10) is given by the inequalities
$$p_r > 0 \quad (r = 1, \ldots, k). \tag{26}$$
Expressions for $p_r$ $(r = 1, 2, \ldots, k)$ in terms of the coefficients of equation (8), for $k = 1, 2, 3, 4$, are given below. These cover the cases likely to occur in practice.* From these values the criteria may be obtained directly. These criteria are given below, in expanded form, for $k = 1, 2, 3$ and 4. As the expansions of the determinants in terms of the coefficients of equation (8) become unwieldy in cases in which $k$ exceeds 4, it is more convenient to work with the $p$ values in determinantal form. It is worth noting, furthermore, that the labour involved in evaluating the $P$ criteria is small relative to the labour required to evaluate the roots of the equation.
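The procedure just described (substitute, collect powers of $y$, form the matrix of coefficients, test its leading principal minors) can be sketched in Python as follows. This is a minimal sketch, not the author's own computing scheme: it tests all $k$ leading principal minors, which is equivalent to the conditions stated in the text, and the function names are hypothetical. Exact rational arithmetic is used so that the signs of the minors are not affected by rounding.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given by ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def y_polynomial(alphas):
    """Ascending coefficients of sum_j alpha_j (y+1)^(k-j) (y-1)^j, alpha_0 = 1."""
    coeffs = [Fraction(1)] + [Fraction(a) for a in alphas]
    k = len(alphas)
    total = [Fraction(0)] * (k + 1)
    for j, alpha_j in enumerate(coeffs):
        term = [Fraction(1)]
        for _ in range(k - j):
            term = poly_mul(term, [1, 1])     # factor (y + 1)
        for _ in range(j):
            term = poly_mul(term, [-1, 1])    # factor (y - 1)
        for r in range(k + 1):
            total[r] += alpha_j * term[r]
    return total


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def roots_inside_unit_circle(alphas):
    """True if every root of x^k + alpha_1 x^(k-1) + ... + alpha_k has modulus < 1."""
    q = y_polynomial(alphas)
    if q[0] == 0:                 # y = 0 is a root, i.e. x = -1 lies on the circle
        return False
    p = [coef / q[0] for coef in q]           # scale so that p_0 = 1
    k = len(alphas)

    def p_at(r):
        return p[r] if 0 <= r <= k else Fraction(0)

    # Entry in row i, column j (counting from 1) is p_(2j - i).
    matrix = [[p_at(2 * (j + 1) - (i + 1)) for j in range(k)] for i in range(k)]
    return all(det([row[:r] for row in matrix[:r]]) > 0 for r in range(1, k + 1))
```

For example, the quadratic with roots 0.2 and 0.3 has coefficients $\alpha_1 = -0.5$, $\alpha_2 = 0.06$ and passes the test, whereas the quadratic with roots 2 and 0.5 ($\alpha_1 = -2.5$, $\alpha_2 = 1$) fails it.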
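Values of $c_{rj}$

    k = 1                     k = 2
          r:   0   1                r:   0   1   2
    j = 0      1   1          j = 0      1   2   1
    j = 1     -1   1          j = 1     -1   0   1
                              j = 2      1  -2   1

    k = 3                             k = 4
          r:   0   1   2   3                r:   0   1   2   3   4
    j = 0      1   3   3   1          j = 0      1   4   6   4   1
    j = 1     -1  -1   1   1          j = 1     -1  -2   0   2   1
    j = 2      1  -1  -1   1          j = 2      1   0  -2   0   1
    j = 3     -1   3  -3   1          j = 3     -1   2   0  -2   1
                                      j = 4      1  -4   6  -4   1

* For more extensive tabulations, for $k = 1, 2, \ldots, 10$, see Wise (1955).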


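Tables of the p-coefficients

$k = 1$:
$$p_1 = \frac{1 + \alpha_1}{1 - \alpha_1}.$$

$k = 2$:
$$p_1 = \frac{2(1 - \alpha_2)}{1 - \alpha_1 + \alpha_2}, \qquad p_2 = \frac{1 + \alpha_1 + \alpha_2}{1 - \alpha_1 + \alpha_2}.$$

$k = 3$:
$$p_1 = \frac{3 - \alpha_1 - \alpha_2 + 3\alpha_3}{1 - \alpha_1 + \alpha_2 - \alpha_3}, \qquad p_2 = \frac{3 + \alpha_1 - \alpha_2 - 3\alpha_3}{1 - \alpha_1 + \alpha_2 - \alpha_3}, \qquad p_3 = \frac{1 + \alpha_1 + \alpha_2 + \alpha_3}{1 - \alpha_1 + \alpha_2 - \alpha_3}.$$

$k = 4$:
$$p_1 = \frac{2(2 - \alpha_1 + \alpha_3 - 2\alpha_4)}{1 - \alpha_1 + \alpha_2 - \alpha_3 + \alpha_4}, \qquad p_2 = \frac{2(3 - \alpha_2 + 3\alpha_4)}{1 - \alpha_1 + \alpha_2 - \alpha_3 + \alpha_4},$$
$$p_3 = \frac{2(2 + \alpha_1 - \alpha_3 - 2\alpha_4)}{1 - \alpha_1 + \alpha_2 - \alpha_3 + \alpha_4}, \qquad p_4 = \frac{1 + \alpha_1 + \alpha_2 + \alpha_3 + \alpha_4}{1 - \alpha_1 + \alpha_2 - \alpha_3 + \alpha_4}.$$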
* Tables of the P-criteria 
k=l: 
P, =p, 
_l+a 
cs -0 
5-2 
P,-p 
.. 3(1—-2o) 
TCü-axeay " 
P: =p 
Loro +a) 
Y (1—e o) ^ 
k=3: 
Pp 
(3—24 — ay 325) . 
aA 0, 
(Exam Fam) R 
P, = PiP2—Ps 
= (B= os das) (B04 — Hp Borg) (1-05 +00). o 
ae (oos ont (1—24 + 64 — 03) 
TP =p, f 
MICE +a +a) 


=z. 
(1—24, +a — 24) 


Miscellanea 2 
219 


Pom 


cob) 


P,=PiP2—Ps 
4(2 — a; as — 224) (8 Ae + Aa) 2(2+ a — 2 — 204) 
(1a, s — ts + a)? "ox pam S 


P= pi Paps- Pip- P3 
A s2—o, ta, ae ta Bea) n ta Pa 
(1 — a s Ma Fa)” 
go 
(1 — oy + Og tst a)? 


Aout OT 
= >, 
(a + Oa as ts)” 


= (12-2 + og E atta) 
d een 20. 
(aa, d, tata) 


REFERENCES

DOOB, J. L. (1953). Stochastic Processes. New York: John Wiley and Sons.
ROUTH, E. J. (1930). Dynamics of a System of Rigid Bodies. London: Macmillan.
WISE, J. (1955a). Ph.D. Thesis (unpublished), University of London.
WISE, J. (1955b). The autocorrelation function and the spectral density function. Biometrika, 42, 151.
WOLD, H. (1954). A Study in the Analysis of Stationary Time Series. Uppsala: Almqvist and Wiksell.


Some properties of an angular transformation for the correlation coefficient


By B. I. HARLEY
University College London


1. Sheppard (1899), in an early paper in which he discussed the association between two variables, used a quantity which is equivalent to the arc sine of the correlation coefficient. As far as the writer knows, the properties of such a quantity have not been discussed, although Jenkins (1954), following suggestions put forward by others, did investigate the arc sine of the serial correlation coefficient. It is of interest to carry the work of Sheppard a stage further and in particular to show that, if $r$ is the correlation coefficient in the sample and $\rho$ that in the normal population, then
$$\mathscr{E}(\sin^{-1}r) = \sin^{-1}\rho.$$


2. Let
$$y = \sin^{-1}r,$$
and expand $y$ in a Taylor series about the point $r = \rho$. Using the moments of $r$ expressed in the form of series, we have on taking expectations and retaining terms of order $n^{-2}$ that
$$\mathscr{E}(y) = \sin^{-1}\rho, \tag{1}$$

KJ) = Dudas ett (2) 

Kaly) rea] (3) 
(4) 


a1- P} F 
e EE ae Ys 




from which we deduce 


2 5n2 b 
Aly) = £ [ose enn]... (5) 
n n n= 
2 2 " (6) 
Pay) = 34> aee ni Saepe... 


For large $n$ the quantity $\sin^{-1}r$ is therefore approximately normally distributed.
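3. R. A. Fisher (1921) gave the normalizing transformation
$$z = \tfrac{1}{2}\log_e\frac{1+r}{1-r} = \tanh^{-1}r,$$
for which (making use of the correction given by A. K. Gayen (1951)) we have the moments
$$\mathscr{E}(z) = \tanh^{-1}\rho + \frac{\rho}{2(n-1)}\left\{1 + \frac{5 + \rho^2}{4(n-1)} + \ldots\right\}, \tag{7}$$
$$\kappa_2(z) = \frac{1}{n-1}\left\{1 + \frac{4 - \rho^2}{2(n-1)} + \frac{22 - 6\rho^2 - 3\rho^4}{6(n-1)^2} + \ldots\right\}, \tag{8}$$
and the $\beta$ coefficients
$$\beta_1(z) = \frac{\rho^6}{(n-1)^3} + \ldots, \tag{9}$$
$$\beta_2(z) = 3 + \frac{2}{n-1} + \frac{4 + 2\rho^2 - 3\rho^4}{(n-1)^2} + \ldots. \tag{10}$$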
Table 1. Comparison of the values of β_1 and β_2 for the distributions r, z and y

n = 20

  ρ     β_1(r)    β_1(z)            β_1(y)    β_2(r)    β_2(z)    β_2(y)
 0.1    0.0148    10^-9 × 0.1458    0.0044    2.7406    3.1164    2.9281
 0.2    0.0600    10^-8 × 0.9331    0.0175    2.8213    3.1166    ..
 0.3    0.1386    10^-6 × 0.1063    0.0397    2.9623    3.1168    ..
 0.4    0.2557    10^-6 × 0.5972    0.0716    3.1749    3.1170    3.0470
 0.5    0.4202    10^-5 × 0.2278    0.1136    3.4783    3.1172    3.1162
 0.6    0.6464    10^-5 × 0.6802    0.1666    3.9055    3.1173    ..
 0.7    0.9584    10^-4 × 0.1715    0.2314    4.5131    3.1171    3.2920
 0.8    1.4001    10^-4 × 0.3822    0.3091    5.4154    3.1165    ..
 0.9    2.0603    10^-4 × 0.7748    0.4004    6.8681    3.1154    3.5105

24 Ub 
. jl] ion i8 52^ 
Apart from its normalizing properties, the great practical advantage of the $z$ transformation is that it gives a variance which, to order $(n-1)^{-1}$, is independent of $\rho$. The $y$ transformation does not have this property but it provides a much neater value for the expectation, which is also independent of $n$. Without going into the question of the application of the results, it is proposed to make some numerical comparisons of the properties of the $\sin^{-1}r$ and $\tanh^{-1}r$ distributions.

4. A comparison of the values of $\beta_1(z)$ and $\beta_2(z)$ with $\beta_1(y)$ and $\beta_2(y)$ for $n = 20$ and $\rho = 0.1(0.1)0.9$ is given in Table 1. The values of $\beta_1(r)$ and $\beta_2(r)$, as given in Tables for Statisticians and Biometricians, vol. 2, are given for completeness. It will be noted that $\beta_1(z)$ is much smaller than $\beta_1(y)$ and that, as is true for all values of $\rho$ for $n > 5$, $\beta_2(z)$ is more independent of $\rho$ than $\beta_2(y)$, which follows more closely the values for the $r$ distribution. From an examination of the results it seems unlikely that the $y$ transformation will have a better normalizing effect than the $z$ transformation.


To compare the probability integral of r obtained by using the $\sin^{-1} r$ transformation with values obtained by using the $\tanh^{-1} r$ transformation we take as a first approximation

$$y = \sin^{-1} r,\qquad \mathcal{E}(y) = \sin^{-1}\rho,\qquad \kappa_2(y) = (1-\rho^2)/(n-2),$$


the divisor, n − 2, for the variance giving a good approximation to the expansion (2) when ρ is small. Using this mean and variance, a normal curve can be fitted and the probability integral obtained for various values of n and ρ. As a second approximation a normal curve can be fitted using the values for the mean and variance given in equations (1) and (2) respectively.


Table 2. Comparison of exact and approximate values of the probability integral of r

(i) n = 25, ρ = 0·2

  r       Exact value   From z        From z        From y        From y
                        1st approx.   2nd approx.   1st approx.   2nd approx.
−0·2      0·02748       0·02860       0·02716       0·02435       0·02439
−0·1      0·07379       0·07758       0·07444       0·06999       0·07005
 0        0·16364       0·17083       0·16542       0·16217       0·16225
 0·1      0·30630       0·31551       0·30807       0·31020       0·31025
 0·2      0·49168       0·50000       0·49179       0·50000       0·50000
 0·3      0·68635       0·69177       0·68406       0·69350       0·69344
 0·4      0·84700       0·84994       0·84533       0·84818       0·84810
 0·5      0·94615       0·94798       0·94592       0·94263       0·94258
 0·6      0·98820       0·98928       0·98875       0·98477       0·98475
 0·7      0·99877       0·99909       0·99903       0·99752       0·99751

(ii) n = 25, ρ = 0·6

  r       Exact value   From z        From z        From y        From y
                        1st approx.   2nd approx.   1st approx.   2nd approx.
 0·2      0·00934       0·01072       0·00884       0·00402       0·00424
 0·3      0·03097       0·03598       0·03078       0·02112       0·02186
 0·4      0·09046       0·10311       0·09147       0·08216       0·08365
 0·5      0·22771       0·24994       0·22971       0·23614       0·23769
 0·6      0·47500       0·50000       0·47521       0·50000       0·50000
 0·7      0·77782       0·79299       0·77584       0·78544       0·78381
 0·75     0·89652       0·90531       0·89543       0·88995       0·88833
 0·8      0·96741       0·97140       0·96769       0·95555       0·95442
 0·85     0·99469       0·99586       0·99520       0·98722       0·98670


Similarly, if $z = \tanh^{-1} r$, we take as a first approximation

$$\mathcal{E}(z) = \tanh^{-1}\rho,\qquad \kappa_2(z) = 1/(n-3),$$

and then, as a second approximation, take the values for the mean and variance given in equations (7) and (8), and in each case fit a normal curve with the appropriate mean and variance. The values of the probability integral of r obtained by these approximate methods, together with the exact values as given in F. N. David's tables (1938), are shown in Table 2 for the cases where n = 25 and ρ = 0·2, 0·6.

From an examination of the results it is seen that in the case of the z transformation increasing the number of terms of the expansions for the mean and variance of z increases the accuracy of the approximation considerably, and the values using the second approximation for z are closer to the exact values than any of the other approximations. Increasing the number of terms of the expansion for the variance of the y distribution makes very little difference, since the skewness of the y distribution prevents it from being as good as the z distribution.



5. Previously (1954) we have given the Cornish-Fisher (1937) form of the Edgeworth expansion fitted to the distribution of $\tanh^{-1} r$, using the values of the first four cumulants to the full known accuracy of their expansions. For the purposes of comparison we use the same series expansion to approximate to the distribution of $\sin^{-1} r$, using the cumulants as given in equations (1)-(4). The probability integral values for the cases n = 16, ρ = 0·2 and n = 25, ρ = 0·9, are given in Table 3.


Table 3. Comparison of exact and approximate values of the probability integral of r

(i) n = 16, ρ = 0·2, β₁(y) = 0·0215, β₂(y) = 2·9441

  r       Exact value   Edgeworth
                        expansion for y
−0·2      0·06711       0·06722
−0·1      0·12868       0·12953
 0        0·22076       0·22202
 0·1      0·34349       0·34367
 0·2      0·48934       0·48806
 0·3      0·64273       0·64123
 0·4      0·78316       0·78297
 0·5      0·89186       0·89285
 0·6      0·95947       0·96026
 0·7      0·99034       0·99035

(ii) n = 25, ρ = 0·9, β₁(y) = 0·0319, β₂(y) = 3·4403, β₁(z) = 10⁻⁴ × 0·3844, β₂(z) = 3·0897

  r       Exact value   Edgeworth         Edgeworth
                        expansion for y   expansion for z
 0·75     0·00743       0·00684           0·00742
 0·82     0·05574       0·05834           0·05578
 0·87     0·22387       0·22896           0·22379
 0·90     0·46244       0·45513           0·46247
 0·93     0·78645       0·78143           0·78661
 0·95     0·94612       0·95047           0·94612
 0·965    0·99263       0·99660           0·99264


This method of approximation is unlikely to give as good values from the y transformation as from the z, because the former has not secured so close an approach to normality as the latter. The y transformation may be useful as a quick check, however, for since the expectation of y is known exactly a number of the terms in the Edgeworth expansion is zero, and it is simple to compute.


6. From the moments of §2 it appears that

$$\mathcal{E}(\sin^{-1} r) = \sin^{-1}\rho$$

up to the order of $n^{-1}$ considered there. The identity holds in fact for any order of $n^{-1}$.
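The identity can be illustrated, though of course not proved, by simulation. The sketch below draws repeated samples of size n from a bivariate normal population with correlation ρ, computes r for each, and averages $\sin^{-1} r$. The sample size, number of replications and seed are arbitrary choices made for illustration, and the function names are hypothetical.

```python
import math
import random


def sample_r(n, rho, rng):
    """Correlation coefficient of one bivariate normal sample of size n."""
    c = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(u)
        ys.append(rho * u + c * v)  # y has unit variance and correlation rho with x
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    return max(-1.0, min(1.0, r))  # guard against rounding just outside [-1, 1]


def mean_arcsin_r(n, rho, reps=20000, seed=1):
    """Monte Carlo estimate of E(asin r)."""
    rng = random.Random(seed)
    return sum(math.asin(sample_r(n, rho, rng)) for _ in range(reps)) / reps


if __name__ == "__main__":
    print(mean_arcsin_r(10, 0.5), math.asin(0.5))
```

With these settings the standard error of the simulated mean is roughly 0·002, so agreement to two decimal places is all that should be expected.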
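The probability distribution of r, the correlation coefficient, calculated from a sample of size n randomly drawn from a normal bivariate population with correlation coefficient ρ is known to be

$$p(r \mid n, \rho) = \frac{(1-\rho^2)^{\frac{1}{2}(n-1)}}{\pi\,(n-3)!}\,(1-r^2)^{\frac{1}{2}(n-4)}\,\frac{d^{\,n-2}}{d(r\rho)^{n-2}}\left\{\frac{\cos^{-1}(-\rho r)}{\sqrt{(1-\rho^2r^2)}}\right\}. \qquad (11)$$

Since it is a probability distribution,

$$\int_{-1}^{+1}(1-r^2)^{\frac{1}{2}(n-4)}\,\frac{d^{\,n-2}}{d(r\rho)^{n-2}}\left\{\frac{\cos^{-1}(-\rho r)}{\sqrt{(1-\rho^2r^2)}}\right\}dr = \frac{\pi\,(n-3)!}{(1-\rho^2)^{\frac{1}{2}(n-1)}} = C_n(\rho).$$

Denoting $\dfrac{d^{\,n-2}}{d(r\rho)^{n-2}}\left\{\dfrac{\cos^{-1}(-\rho r)}{\sqrt{(1-\rho^2r^2)}}\right\}$ by $f^{n-2}(r\rho)$, and rearranging the integral, we have

$$\int_{-1}^{+1}\frac{(1-r^2)^{\frac{1}{2}(n-3)}}{\sqrt{(1-r^2)}}\,f^{n-2}(r\rho)\,dr = C_n(\rho). \qquad (12)$$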


Int : 
*grating by parts we have E I 
C,(p) = [sin (1 gym f 2-2 rp) ie sin rz geom frg) ae 
Do $ * 
n> in- 2 x n-2 i i E i 
feelers’ [sinc r(1— rti f n-3(rp)]*t = 0, since frp) is finite and non-zero at the limits +1 

+1 

n-2(rp)dr+(n— »[ f rsin-ir(1 — 72)0-9f"7*(rp) dr, 


"EL d 
Clp) = — sin-1 r(1—72)/-9 p x 


and hence =i 
[o] +1 ior e (ri — 12€ (p)}dr. (18) 
E =A sin-t r1  rtyie-s fep d+] T P) En-alp)} dr. 

"E | i 

noting the expectation of sin 7, when is calculated from a sample of size n, by &,(sin- r), we have 

d 
Z fm sanan E onat |: 14) 
n(p) = — pC nlp) Enp (in r)+(n— 3) ew ip En- (SiN r)+ &n-al ee S ( 
Nubes 
"mn the values of the C's and rearranging We have 
| “n me é „(sin™ r) = a [£i in" r)— épp (Sin 7). (15) 
a- 9p "S = 
p r) = sin- p, since we have assumed n> 2, it follows that 
" To 
Epp (Sin? r)- £a (sin? r) = sin™ P- 
Tom 
(12) when n = 3 we have 
; " 
Clp) = n ü- py dr 
= E 
Dm preos LTD iL —pC,(p) &,(sin-! r). 
= [f zap arp} er 

Feno 7 1 Tp sin?p _ pr i 4 (sint) 

An, cp peg ap) CRA T 
d therefore é,(sin n= sin! p. Ma m. E 
Fro; i Gin" = sin- p and, in fact, San sin-1r) = sint p 

E n = 3 qam (15) it follows that Ue m i Ato same method that £,,(sin-7) for odd sample 
MN It has not been possible 

S equal to sin- p. However, if we assume AU Ad) has 
&,(sin r) = sin Dit sty Sm 
ions of P» then substituting in equation (15) we 


for 


all 
have values of n, where A,(p) for 8 = 4, b, ... 8 


1 zu] 
l Jean e 


O= 2 in e 
iaoa Ale fara Pat Sale) y 
B nsu incar ult 


Ns 24,5. 
1 MINE 2,40 forsch 
:milarl. , we find that 0 = A,(p) = Asp) = «+ 
Ù Equating coefficients of n-3 we find that 44(P) = 0. simie ip for all values of n. 
reasonable to assume that ĉn eu esp 


8 
Sems, therefore: 




7. In his paper 'New light on the correlation coefficient and its transforms', H. Hotelling (1953, p. 221) states as a theorem:

'No function ψ(r), independent of ρ and n, satisfying the conditions for expansibility in a Taylor series through terms of fourth order, and termwise integrable to give the expectation in a series of powers of $n^{-1}$, exists such that $\mathcal{E}(\psi(r)) = \psi(\rho)$.'

We think, however, that this conclusion is due to a mistake, for if the substitution which Hotelling refers to in the penultimate line of his p. 220 is made, we find that the coefficient of $n^{-2}$ in his earlier expansion for $\mathcal{E}(\psi(r)) - \psi(\rho)$ vanishes identically for all values of ρ.


I wish to thank Dr F. N. David and Dr D. E. Barton for the helpful suggestions they made during the 
preparation of this paper. 


REFERENCES

CORNISH, E. A. & FISHER, R. A. (1937). Rev. Inst. int. Statist., Paris, 4, 307.
DAVID, F. N. (1938). Tables of the Ordinates and Probability Integral of the Distribution of the Correlation Coefficient in Small Samples. Cambridge University Press.
FISHER, R. A. (1921). Metron, 1, no. 4, p. 1.
GAYEN, A. K. (1951). Biometrika, 38, 219.
HARLEY, B. I. (1954). Biometrika, 41, 278.
HOTELLING, H. (1953). J. R. Statist. Soc. B, 15, 193.
JENKINS, G. M. (1954). Biometrika, 41, 261.
PEARSON, K. (1931). Tables for Statisticians and Biometricians, 2. Cambridge University Press.
SHEPPARD, W. F. (1899). Phil. Trans. A, 192, 101.


Note on the moment-problem for unimodal distributions when 
one or both terminals are known 


By C. L. MALLOWS 
University College London 


1. Several mathematical models have been proposed in attempts to represent biological data relating, for example, to counts of species in randomly chosen regions. Anscombe (1950) gives a discussion; he compares eight different two-parameter distributions. He remarks that of these distributions, some have only one mode, some one or two, while some may have any number from one upwards. This progression is reflected in the behaviour of the third and fourth cumulants; the number of modes appears to increase with a decrease in both skewness and leptokurtosis.

It is of interest to determine whether in fact this must be so; more specifically, to determine the 'unimodality boundary' for such distributions, i.e. a boundary defining values of the lower moments such that any distribution having these moments must of necessity be multimodal.


2. The necessary and sufficient conditions which the set of moments $\mu_0, \mu_1, \dots, \mu_n$ must satisfy in order that a (cumulative) distribution function φ(x) having these moments may exist are known in the situations where x

(a) may take any value between −∞ and +∞;
(b) is restricted to positive values;
(c) is restricted to lie in a finite interval $(t_1, t_2)$ (see, for example, Shohat & Tamarkin (1943)).

By means of a transformation Johnson & Rogers (1951) reduced the case (a), where the additional restriction is applied that the distribution is to be unimodal, to the case (a) without restriction. It is now shown that this transformation can also be applied in the cases (b) and (c) above.

These results apply immediately only to continuous distributions; however, we can obtain results for the discrete case by considering the distribution of the sum of the discrete variable and an independent rectangular variable. This will have a 'histogram' distribution which may be regarded as continuous.




3. iti : 
The conditions referred to in §2 are as follows (moments about zero throughout)*: 
C m " 

ase (a): t, = — co, t, = +00. The determinants 


E z = 
E aiios AQ) = [istis (P= 0s Boi 


Case (b); t, = 0, tg = +00. The determinants 


are Positive, A“) (r= 0,1,-.-.[3%]) Aj) = | Missa lí;e (7=0,1,-.5 [1n — 1)]) 


Case (c): tà 2 —1, t, = +1. If n is even = 2m, the determinants 


" Ag), At) = Hs Hiriei (m Olam) 
ae 
? Positive; if n = 2m-+1, the determinants 
Mi M = — f = 
re positive, Al (H) = | Hiss inii I;-o ATQ) = | Hess Pissa liso C ONL, 2099) 
4. The transformation used by Johnson & Rogers in the case (a) is as follows. Suppose $\phi''(x)$ exists for all x, and

$$\phi''(x) \ge 0 \quad (x \le \xi), \qquad \phi''(x) \le 0 \quad (x > \xi),$$

i.e. the distribution is unimodal, and has a mode at ξ. Note that we may have $\phi''(x)$ vanishing over an interval including ξ, so that there is no unique mode. Any unimodal distribution (continuous, but whose second derivative need not exist; e.g. a 'histogram' distribution) can be approximated as closely as desired by such a φ(x).

Consider the function ψ(x) given by

$$\psi'(x) = (\xi - x)\,\phi''(x), \qquad \psi(-\infty) = 0.$$

Then ψ(x) is a distribution function with moments

$$\nu_r = \int x^r\,d\psi = (r+1)\,\mu_r - r\,\xi\,\mu_{r-1},$$

and the ν's must satisfy the conditions of §3 (a); we thus obtain limits for ξ for the given moments $\mu_1, \dots, \mu_n$, and may derive the required conditions on the moments themselves.

The determinant $\Delta_r(\nu)$ can be manipulated into the following form:
1 
4 
^ 0 


+j- Dea [Eor 
ases (b) and (c) above. We obtain the following 


Whi 
| Whig å js 
S h will be denoted by AQ) = | Ps 
5, sod i ei 
E The Same transformation can be app pes e ih T 

da Where the moments aro about Zero rd exist with a mode at £, it is necessary and sufficient 
that, a unimodal solution of the moment probe 


- oo. The determinants 


TN 
' b). i = = 
(b). Terminals t, = 0, t = (r= 9,1, [3n—1)) 


X A(v) (r= 0,1,- [hn]. Aj =l: 


= Positive, 
he distribution funetion does not reduce to 


s of t 
f the spec mninants will vanish for large r. 
e 


(i3) Ito 


e ints; in the e th 
Set of poin contrary cas Biom. 43 


* 
ee, Th, A T 
& g ; ese ts apply strictl ly i 
fini t results apply strictly only 

i. 


Case (c). Terminals 4 = — 1, t, = +1. If n = 2m, the determinants 
Av), A70) =|; +j- D) Hissa — (+j + 1) iss liano (r= 0, l,...,m) 
are positive; if n = 2m+1, the determinants 


AT (v) = | gis +j- 1) Kiss- t (2-7) Hiji krag 


Aro) =|; +j- 1) i-a — (È+) tia ee 


| (r0, 1,...,m) 
are positive. 


6. (i) Thus in case (c) (terminals — 1, + 1), where the moments Ho = 1, pt, h 


2 are known, the first few 
conditions reduce to 


(8 — IY < 3ta — 13) <1 — 3p + 9p, f, 


and hence we must have (since — 1 <f<1) 


0<3( re i Oe da] (44 7 0), 
dio Q-4)0—32) (i0). 


These are inequalities for the moments ds 


Kz when the terminals are given as — 1, +1. We may write 
them also as inequalities for the terminals ty 


» ta when the moments are given as Ha = 0, p, = l; we find 
3-261, +8<0 (& +t,<0), 
9255-50 (6-1 0). 


(ii) In ease (b) (lower terminal zero) the first few conditions are: 


0c20—5, 0 «3(s — n) — (8 — uy, 
O< 8/15 — 945 + 2 (3p, us — 2) + A? 3j; — 4p), 
O< 15, ju, — 1692 + 48 fly ha — 204? py — 2743 
— PLC 5 fy + 129, ps — 10 pty — 18%, 43) 
"EPI pg + 902 — õla 


7. Suppose y is a random variable takin 
Nor Mis -.., and u is an independent rectan 


— 1245 k) — (129, fla — Ap, — 8x3). 


g only the values OFS. 


- and having moments (about zero) 
gular variable (unifo 


rm in (0, 1)). Then z = y +u has moments 


» (n.n Eee 
=>) (5 - 2. ( ) j 
E NET ral 3 )% 


0<27,+1-2, 0301511) -3— (85, — 8, 
087,1, — 93+ 4j, — 6 p, — 


A general investigation of the implications of 
cannot standardize the moments; ‘location’ and ‘scale’ are determined by the lower terminal (zero) 
and the interval (unity). 

Anscombe compared four distributions having 7, 
(i) negative binomial, (ii) Polya—Aeppli, (iii) Neyman Type A, (i 
(ii) two, and (iii) and (iv) each three. The third and fourth mo 


1 — plah — 671 Na + 3, — 673 —3,) + ÉN(39,— 4yit—m). 




REFERENCES

ANSCOMBE, F. J. (1950). Sampling theory of the negative binomial and logarithmic series distributions. Biometrika, 37, 358-82.
JOHNSON, N. L. & ROGERS, C. A. (1951). The moment problem for unimodal distributions. Ann. Math. Statist. 22, 433-9.
SHOHAT, J. A. & TAMARKIN, J. D. (1943). The problem of moments. Mathematical Surveys, no. 1. American Mathematical Society.


On inverting a class of patterned matrices 


By S. N. ROY AND A. E. SARHAN*


Institute of Statistics, University of North Carolina

1. Introduction. We shall use here bold capital letters for matrices and primed letters for their transposes, small letters for scalars, small bold letters for column vectors, small bold primed letters for row vectors. For example, a will stand for a column vector whose typical element is $a_i$ and (ab/c) for a column vector whose typical element is $a_ib_i/c_i$. $D_a(k \times k)$ will stand for a diagonal matrix whose diagonal elements are $(a_1, \dots, a_k)$ and $M(k \times k)$ will stand for a lower triangular matrix. If all elements of M are unity we shall write $M(k \times l)$ as $J(k \times l)$.


2 
9. Yr 
b, 0 
Ms (ex k) = yb, agba 0 EE 0 : A 
aibj E EC S ES 
ay by aby arbs + axb 
then it is easy to check that ; ; 
yia Oe Oks a i é 
—1/bo% 1/bo4g 0 .- a M ec A 
Mahe x k) = ay, Ee rer 
i —]l[bjay l/brax 
0 0 e 
M Cint 
0, 6303 pe 
1 €, C301 1&3 
cia 1⁄2 = Rete 
$(a, + 42) ca C3(i + 42 E > a 
c es ne +) (atas +a) eg Cs (1 +42 3 
=| ccm 000777 3 eF 
» 3(a, + dg t+ +n) 
CC (a, +42) es es (ta + 42+) cia + d» 
c,0401 Carn : ) 
W NT as 3 0 ANTO 
E (5*2) ET 
Izd L Qn e 0 
1 1 (b = n 
Gea, Ma "s Ca C33 ; i 
di (s 1 Jost + ~) " 1 o 
: C335 lo as C304 04 P ! 
: | 1 
: i EN oo 
e invert both sides and use (2:1) and (2:2). 
i E i rS NCC igh the Office of 
i. vie koe he United States Air Force throug’ 
i i i pt, by the a 
* This research was supported, 1 pai : by spoil n i 


“lentifie Research of the Air Research an 




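Example 1. We need to invert for various purposes the following symmetric positive-definite matrix which is the variance-covariance matrix of order statistics for a sample of size n from an exponential population (Sarhan, 1954):

$$V(n \times n) = \begin{bmatrix}
\dfrac{1}{n^2} & \dfrac{1}{n^2} & \dfrac{1}{n^2} & \cdots & \dfrac{1}{n^2} \\
\dfrac{1}{n^2} & \dfrac{1}{n^2}+\dfrac{1}{(n-1)^2} & \dfrac{1}{n^2}+\dfrac{1}{(n-1)^2} & \cdots & \dfrac{1}{n^2}+\dfrac{1}{(n-1)^2} \\
\dfrac{1}{n^2} & \dfrac{1}{n^2}+\dfrac{1}{(n-1)^2} & \sum\limits_{i=1}^{3}\dfrac{1}{(n-i+1)^2} & \cdots & \sum\limits_{i=1}^{3}\dfrac{1}{(n-i+1)^2} \\
\vdots & \vdots & \vdots & & \vdots \\
\dfrac{1}{n^2} & \dfrac{1}{n^2}+\dfrac{1}{(n-1)^2} & \sum\limits_{i=1}^{3}\dfrac{1}{(n-i+1)^2} & \cdots & \sum\limits_{i=1}^{n}\dfrac{1}{(n-i+1)^2}
\end{bmatrix}. \qquad (3\cdot3)$$

This can be put in the form (3·1) by putting

$$c_1 = c_2 = \dots = c_n = 1$$

and

$$a_1 = \frac{1}{n^2},\quad a_2 = \frac{1}{(n-1)^2},\quad a_3 = \frac{1}{(n-2)^2},\quad \dots,\quad a_n = 1. \qquad (3\cdot4)$$

Therefore, by substituting in (3·2), we have

$$V^{-1} = \begin{bmatrix}
n^2+(n-1)^2 & -(n-1)^2 & 0 & \cdots & 0 & 0 \\
-(n-1)^2 & (n-1)^2+(n-2)^2 & -(n-2)^2 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -1 & 1
\end{bmatrix}. \qquad (3\cdot5)$$

This inverse can also be obtained by induction.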
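The tridiagonal form of the inverse in Example 1 can be checked in exact rational arithmetic. The sketch below assumes the covariance matrix has (j, k) element $\sum_{i=1}^{\min(j,k)} 1/(n-i+1)^2$ and that the inverse has diagonal elements $1/a_i + 1/a_{i+1}$ (with $1/a_n$ alone in the last place) and off-diagonal elements $-1/a_{i+1}$; all function names are hypothetical.

```python
from fractions import Fraction


def order_statistic_covariance(n):
    """V with (j, k) entry sum_{i=1}^{min(j,k)} 1/(n-i+1)^2."""
    a = [Fraction(1, (n - i) ** 2) for i in range(n)]  # a_1 = 1/n^2, ..., a_n = 1
    return [[sum(a[: min(j, k) + 1], Fraction(0)) for k in range(n)]
            for j in range(n)]


def tridiagonal_inverse(n):
    """Closed-form tridiagonal inverse of V."""
    inv_a = [(n - i) ** 2 for i in range(n)]  # 1/a_1 = n^2, ..., 1/a_n = 1
    w = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        w[i][i] = Fraction(inv_a[i] + (inv_a[i + 1] if i + 1 < n else 0))
        if i + 1 < n:
            w[i][i + 1] = w[i + 1][i] = Fraction(-inv_a[i + 1])
    return w


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    n = 5
    print(matmul(order_statistic_covariance(n), tridiagonal_inverse(n)) == identity(n))
```

Exact fractions are used so that the comparison with the identity matrix is not affected by rounding.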
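4. If

$$C(k \times k) = [D_p(k \times k) + \lambda\,q(k \times 1)\,r'(1 \times k)], \qquad (4\cdot1)$$

then

$$C^{-1} = D_{1/p} - \frac{\lambda}{1 + \lambda\sum_{i=1}^{k} q_ir_i/p_i}\,(q/p)\,(r/p)'. \qquad (4\cdot2)$$

Proof. Assume that

$$C^{-1} = D_{1/p}(k \times k) + \mu\,(q/p)\,(r/p)', \qquad (4\cdot3)$$

where μ is unknown. Therefore

$$I(k \times k) = CC^{-1} = (D_p + \lambda qr')\,(D_{1/p} + \mu\,(q/p)(r/p)')$$

$$= I + \mu\,q(r/p)' + \lambda\,q(r/p)' + \lambda\mu\sum_{i=1}^{k}(q_ir_i/p_i)\,q(r/p)'$$

$$= I + \Bigl(\mu + \lambda + \lambda\mu\sum_{i=1}^{k} q_ir_i/p_i\Bigr)\,q(r/p)'.$$

Thus

$$\mu + \lambda\mu\sum_{i=1}^{k} q_ir_i/p_i + \lambda = 0 \qquad (4\cdot4)$$

or

$$\mu = -\lambda\Big/\Bigl(1 + \lambda\sum_{i=1}^{k} q_ir_i/p_i\Bigr),$$

which completes the proof of (4·2).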
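The result of this section can be verified numerically for any particular choice of p, q, r and λ. The sketch below builds $C = D_p + \lambda qr'$ and the proposed inverse $D_{1/p} + \mu\,(q/p)(r/p)'$ with $\mu = -\lambda/(1+\lambda\sum q_ir_i/p_i)$, and multiplies them together. The numerical values and function names are arbitrary illustrative choices.

```python
from fractions import Fraction


def rank_one_matrix(p, q, r, lam):
    """C = D_p + lam * q r'."""
    k = len(p)
    return [[(p[i] if i == j else 0) + lam * q[i] * r[j] for j in range(k)]
            for i in range(k)]


def rank_one_inverse(p, q, r, lam):
    """Inverse D_{1/p} + mu (q/p)(r/p)' with mu = -lam / (1 + lam * sum q_i r_i / p_i)."""
    k = len(p)
    s = sum(q[i] * r[i] / p[i] for i in range(k))
    mu = -lam / (1 + lam * s)
    return [[(1 / p[i] if i == j else 0) + mu * (q[i] / p[i]) * (r[j] / p[j])
             for j in range(k)] for i in range(k)]


def matmul(x, y):
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def identity(k):
    return [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]


if __name__ == "__main__":
    p = [Fraction(2), Fraction(3), Fraction(5)]
    q = [Fraction(1), Fraction(2), Fraction(3)]
    r = [Fraction(1), Fraction(1), Fraction(2)]
    lam = Fraction(1, 2)
    product = matmul(rank_one_matrix(p, q, r, lam), rank_one_inverse(p, q, r, lam))
    print(product == identity(3))
```

The formula requires $1 + \lambda\sum q_ir_i/p_i \ne 0$ and all $p_i \ne 0$; the sketch does not guard against these cases.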


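Example 2. We need to invert the following symmetric positive-definite matrix which is the variance-covariance matrix of the multinomial distribution:

$$V(k \times k) = \begin{bmatrix}
p_1(1-p_1) & -p_1p_2 & -p_1p_3 & \cdots & -p_1p_k \\
-p_1p_2 & p_2(1-p_2) & -p_2p_3 & \cdots & -p_2p_k \\
-p_1p_3 & -p_2p_3 & p_3(1-p_3) & \cdots & -p_3p_k \\
\vdots & \vdots & \vdots & & \vdots \\
-p_1p_k & -p_2p_k & -p_3p_k & \cdots & p_k(1-p_k)
\end{bmatrix}.$$

This can be put in the form

$$D_p(k \times k) + \lambda\,p(k \times 1)\,p'(1 \times k),$$

where λ = −1, that is,

$$V = \begin{bmatrix} p_1 & & 0 \\ & \ddots & \\ 0 & & p_k \end{bmatrix} - \begin{bmatrix} p_1 \\ \vdots \\ p_k \end{bmatrix}(p_1 \dots p_k).$$

Therefore, from (4·2) we have

$$V^{-1} = D_{1/p} + \frac{1}{1-\sum_{i=1}^{k}p_i}\,J = \begin{bmatrix}
\dfrac{1}{p_1} + \dfrac{1}{1-\sum p_i} & \dfrac{1}{1-\sum p_i} & \cdots & \dfrac{1}{1-\sum p_i} \\
\dfrac{1}{1-\sum p_i} & \dfrac{1}{p_2} + \dfrac{1}{1-\sum p_i} & \cdots & \dfrac{1}{1-\sum p_i} \\
\vdots & \vdots & & \vdots \\
\dfrac{1}{1-\sum p_i} & \dfrac{1}{1-\sum p_i} & \cdots & \dfrac{1}{p_k} + \dfrac{1}{1-\sum p_i}
\end{bmatrix}.$$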
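A short check of Example 2 is sketched below for an illustrative set of probabilities with $\sum p_i < 1$ (so that the matrix is non-singular). It assumes the inverse has elements $1/p_i + 1/(1-\sum p_i)$ on the diagonal and $1/(1-\sum p_i)$ elsewhere; the probabilities and function names are hypothetical.

```python
from fractions import Fraction


def multinomial_covariance(p):
    """V with p_i (1 - p_i) on the diagonal and -p_i p_j off it."""
    k = len(p)
    return [[p[i] * (1 - p[i]) if i == j else -p[i] * p[j] for j in range(k)]
            for i in range(k)]


def multinomial_inverse(p):
    """D_{1/p} + J / (1 - sum p_i)."""
    k = len(p)
    rest = 1 / (1 - sum(p))
    return [[(1 / p[i] if i == j else 0) + rest for j in range(k)] for i in range(k)]


def matmul(x, y):
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


if __name__ == "__main__":
    p = [Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)]  # remaining mass 2/5
    prod = matmul(multinomial_covariance(p), multinomial_inverse(p))
    print(prod == [[int(i == j) for j in range(3)] for i in range(3)])
```

The quantity $1-\sum p_i$ is the probability of the one category left out of the k × k matrix, which is why the full (k+1)-category covariance matrix would be singular.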
5. If 
QC((n4-p) x (n+p) fub 
n nap = à 
p POS NaS Dp P" 
n p 
li J J 
PN mr gite p n 
^| a eur] i 
m aii 5 
n p 
Bs. ue e at Se oe 
É k(nap — mk)’ na?p — mk m(na%p — mk) a 
Pro 
We hr Assuming C-! to be of the general structure (5:2) and equating CC^! to H{(n +p) x (n p) 
i a 
goeveet + paps Posi vee i 
he 
1 
pde 3 d +ayps eee ae ü 
5 , 
3 n P 
ym anf = 0, d 


a 
ak+pap=0, Pk+—+ayp = 0, 
from m 
whi 
hich, by solving for a, f and y, we have (5:3) and thus (5-2). 
ingular matrix which occurs in analysis 


Ezra 
m: 
for E 3. We need to invert the following non-s 
ay classification with equal frequencies in the different cells 


c p, J\m 
zi p,Jk-1 


m k-1l 


of variance 


(5:5) 




This is easily seen to be a special case of (5·1) by putting a = 1, n = m, p = k − 1. Now substituting in (5·3) and (5·2) we have


1 k=1 1 
T. e J c m ". 
di 1 I 1 
—-—J —I+-J/k-1 
m m m 
m k—1 
6. If D. J Ji E 
C(3k-2)x(3k-2)9 2| J D, 3J |x—1, ( 
J J Djxk-1 
k k-1k-1 
then Dita — BI yJ Vk | 
G=] 2I  DueH gI jai (6:2) 
yJ $J . Dy + EI) k—1 
k k-1 k-1 
whera PTE poya ie dedi and dann. (6:8) 


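Proof. Assuming $C^{-1}$ to be of the general structure (6·2) and equating $CC^{-1}$ to $I\{(3k-2) \times (3k-2)\}$, we have, as in the case of (5·2), six linear equations in the six unknowns α, β, γ, δ, ε, ζ, from which, on solving, we have the unknowns in the form (6·3).

The matrices C and $C^{-1}$ of (6·1) and (6·2) occur in analysis of variance for Latin squares.

7. If

$$C(k \times k) = \begin{bmatrix}
a & b & b & \cdots & b \\
b & c & d & \cdots & d \\
b & d & c & \cdots & d \\
\vdots & \vdots & \vdots & & \vdots \\
b & d & d & \cdots & c
\end{bmatrix}, \qquad (7\cdot1)$$

then

$$C^{-1} = \begin{bmatrix}
e & f & f & \cdots & f \\
f & g & h & \cdots & h \\
f & h & g & \cdots & h \\
\vdots & \vdots & \vdots & & \vdots \\
f & h & h & \cdots & g
\end{bmatrix}, \qquad (7\cdot2)$$

where

$$e = \frac{1}{a}\,[1 + \lambda b^2(k-1)],\quad f = -\lambda b,\quad g = \frac{1}{c-d}\,[1 - \lambda(ad - b^2)],\quad h = \lambda(b^2 - ad)/(c-d), \qquad (7\cdot3)$$

and

$$\lambda = 1/[a(c-d) + (k-1)(ad - b^2)]. \qquad (7\cdot4)$$

Proof. The right-hand side of (7·1) can be written in the form

$$\begin{bmatrix} a & b\,1' \\ b\,1 & D_{c-d} + dJ \end{bmatrix}, \qquad (7\cdot5)$$

the partition being into 1 and k − 1 rows and columns. We now assume that the right-hand side of (7·2) can be written in the form

$$\begin{bmatrix} e & f\,1' \\ f\,1 & D_{g-h} + hJ \end{bmatrix}, \qquad (7\cdot6)$$

where e, f, g, h are undetermined, and $1' = (1, 1, \dots, 1)$.

Multiplying (7·5) with (7·6) and equating to I(k × k), we have

$$\begin{bmatrix}
ae + bf(k-1) & \{af + b(g-h) + bh(k-1)\}\,1' \\
\{be + f(c-d) + df(k-1)\}\,1 & bfJ + (c-d)(g-h)\,I + h(c-d)\,J + d(g-h)\,J + dh(k-1)\,J
\end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & I \end{bmatrix}. \qquad (7\cdot7)$$

Thus we should have

$$ae + bf(k-1) = 1,\qquad af + b(g-h) + bh(k-1) = 0,\qquad (c-d)(g-h) = 1,$$
$$bf + h(c-d) + d(g-h) + dh(k-1) = 0, \qquad (7\cdot8)$$

solving which we obtain (7·3).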
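The closed form for e, f, g, h can be checked directly. The sketch below assumes the pattern described in this section (a in the leading position, b along the rest of the first row and column, c on the remaining diagonal and d elsewhere) and the expressions $\lambda = 1/[a(c-d)+(k-1)(ad-b^2)]$, $e = [1+\lambda b^2(k-1)]/a$, $f = -\lambda b$, $g = [1-\lambda(ad-b^2)]/(c-d)$, $h = \lambda(b^2-ad)/(c-d)$. Function names and the numerical values are illustrative only.

```python
from fractions import Fraction


def patterned(k, a, b, c, d):
    """k x k matrix: a at (0,0), b in rest of first row/column, c on diagonal, d elsewhere."""
    m = [[d] * k for _ in range(k)]
    for i in range(k):
        m[i][i] = c
        m[0][i] = m[i][0] = b
    m[0][0] = a
    return m


def patterned_inverse(k, a, b, c, d):
    """Inverse with the same pattern, using the closed-form e, f, g, h."""
    lam = 1 / (a * (c - d) + (k - 1) * (a * d - b * b))
    e = (1 + lam * b * b * (k - 1)) / a
    f = -lam * b
    g = (1 - lam * (a * d - b * b)) / (c - d)
    h = lam * (b * b - a * d) / (c - d)
    return patterned(k, e, f, g, h)


def matmul(x, y):
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


if __name__ == "__main__":
    k = 4
    a, b, c, d = Fraction(4), Fraction(1), Fraction(3), Fraction(1)
    prod = matmul(patterned(k, a, b, c, d), patterned_inverse(k, a, b, c, d))
    print(prod == [[int(i == j) for j in range(k)] for i in range(k)])
```

The formulas need $a \ne 0$, $c \ne d$ and a non-vanishing denominator in λ; no check for these is made here.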


The Miu DLI1Y; 
matrix (7-1) is common in response surface G and 7 can also be subsumed under 4. 


s. 5, 


The authors would like to thank Prof. M. G. Kendall for his valuable editorial suggestions.


REFERENCE

SARHAN, A. E. (1954). Estimation of the mean and the standard deviation by order statistics. Ann. Math. Statist. 25, 317.


A note on the risks of error involved in the sequential ratio test


By J. MEDHI 


Institut de Statistique, University of Paris 


1. INTRODUCTION

For testing one hypothesis against another, the sequential procedure, formulated by Wald, at first involves the successive calculation of

$$Z_1 = z_1,\qquad Z_2 = z_1 + z_2,\qquad \dots,\qquad Z_n = z_1 + z_2 + \dots + z_n,$$

where

$$z_i = z(x_i) = \log_e\frac{f_0(x_i)}{f_1(x_i)},$$

$f_0(x)$, $f_1(x)$ being the probability densities (the probabilities $p_0$, $p_1$ in case of discrete distribution) of x corresponding to the hypotheses $H_0$, $H_1$ respectively. Then, if at any stage $Z \ge a$, $H_0$ is accepted; if $Z \le b$, $H_1$ is accepted; and in case $b < Z < a$ an additional independent sample* is to be taken. Let the chance of error in accepting $H_1$ when $H_0$ is true be α and that of accepting $H_0$ when $H_1$ is true be β. Then we have approximately

$$a = \log\frac{1-\alpha}{\beta},\qquad b = \log\frac{\alpha}{1-\beta}.$$

As a solution to the important problem of discrimination between the two types of spectra, continuous and discrete, of time series, a test procedure based on such an application of the sequential analysis theory has been proposed by Bartlett (1954). Here the criterion is the log likelihood ratio, $\log(p_0/p_1)$, where $p_0$, $p_1$ are the probabilities for the sample for each of the hypotheses $H_0$ (discrete spectrum), $H_1$ (continuous spectrum), maximized with respect to the unknown parameters involved in the specifications. We regard the available sample observations as the first of a hypothetical sequence of independent samples.


If we take the risks of error in accepting $H_0$ when $H_1$ is true and vice versa to be equal, and not greater than ε, we may, following sequential theory, adopt the decision rule:

(i) if $\log(p_0/p_1) \ge \log\{(1-\epsilon)/\epsilon\}$, then accept $H_0$;
(ii) if $\log(p_0/p_1) \le \log\{\epsilon/(1-\epsilon)\}$, then accept $H_1$;
(iii) if $\log\{(1-\epsilon)/\epsilon\} > \log(p_0/p_1) > \log\{\epsilon/(1-\epsilon)\}$, then reach no decision (see also Medhi (1956)).

Now with the single sample available, if the probability of a decision is not small, such a procedure might lead to a rather over-cautious assessment of the risks ε. It is therefore advisable to investigate the more precise risks of error in such circumstances; in sequential analysis we know that the usual procedure gives a good approximation in most cases. In this note an explicit example has been considered to obtain such risks of error in these circumstances.

* The symbol x may denote a group of observations, i.e. a 'sample'.
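The three-way decision rule with equal risks ε can be written as a small function. This is only a sketch of the rule as stated, with the boundaries $\pm\log\{(1-\epsilon)/\epsilon\}$; the function name and the returned labels are hypothetical.

```python
import math


def decide(log_ratio, eps):
    """Apply the rule to log(p0/p1) with equal risks of error eps (0 < eps < 0.5)."""
    upper = math.log((1.0 - eps) / eps)  # boundary for accepting H0
    lower = -upper                       # log(eps / (1 - eps)), boundary for accepting H1
    if log_ratio >= upper:
        return "accept H0"
    if log_ratio <= lower:
        return "accept H1"
    return "no decision"


if __name__ == "__main__":
    for value in (3.0, 0.5, -3.0):
        print(value, decide(value, 0.05))
```

For ε = 0·05 the boundaries are about ±2·944, so a log likelihood ratio of 0·5 leads to no decision.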




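2. MEAN OF A NORMAL SAMPLE

Let

$$f(x) = \frac{1}{\sigma\sqrt{(2\pi)}}\exp\left[-\frac{(x-m)^2}{2\sigma^2}\right] \qquad (m > 0),$$

and let the two alternative hypotheses be

$$H_0:\ \text{mean} = +m,\qquad H_1:\ \text{mean} = -m;$$

then

$$z_i = z(x_i) = \log\frac{f_0(x_i)}{f_1(x_i)} = \frac{2m}{\sigma^2}\,x_i. \qquad (2\cdot1)$$

We suppose that the hypothesis $H_0$ obtains.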
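Let $p_i^{(r)}$ ($p_i^{(w)}$) denote the probability of arriving at the right (wrong) decision at the ith step, and let

$$p^{(r)} = p_1^{(r)} + p_2^{(r)} + \dots,\qquad p^{(w)} = p_1^{(w)} + p_2^{(w)} + \dots$$

Now g(z), the frequency function of z, is given by

$$g(z) = \frac{1}{\sigma_1\sqrt{(2\pi)}}\exp\left[-\frac{(z - \tfrac{1}{2}\sigma_1^2)^2}{2\sigma_1^2}\right] \qquad (\sigma_1 = 2m/\sigma).$$

The frequency function $g_1(Z_1)$ of $Z_1 = z_1$ is of exactly the same form as g(z), so that we have

$$p_1^{(r)} = \int_a^{\infty} g_1(Z_1)\,dZ_1 = \int_a^{\infty} g(z)\,dz,$$

and

$$p_1^{(w)} = \int_{-\infty}^{b} g_1(Z_1)\,dZ_1 = \int_{-\infty}^{b} g(z)\,dz.$$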


In(Zn), the conditional distribution of Za. — h.e z 
tion* (Samuelson, 1948), 


a 
94.) = f NZa — 5) g,. (5) ds. 
We have here p 


a 
942.) = Jh $.—5)95)ds. (—co eZ, eco) 


where F(x) =f” S(t) dt. 
Hence quss 1 gY( Z5) dZ, 
a 
b 
and rf = m. | s) dz, = 4«^ A jen exp (— w) (Peto) F(t+e)} dt, 
where b, = (b-o7)/(,/20,), e= (oi—-2a)(42o,, of = (ei— 2b)/(,\/20}). 
Again wll) = f qued nls) ds 
32, — 201)? 
“wren, m we) 
bo xr" VENT 
xexp BX a L—(b— Je 
{-2¢ m a x[ " (a wJ- D v ( i») 
so that p? =f 93(Z,) dZ,, 
a 
and 


b 
me = [ 9324) AZ, =f” "gs 
w -—- alla 3 T7 y J20, Fi Fs Kadsai, 


(2:1) 


(2:3) 


(2:3) 


(2:4) 


(2:9) 


m is given for n>1 by the ‘truncated convolu- 


(2:0) 


(27) 


(2:8 


(2:9) 


(2:10) 
(2:10) 


(212) 




where 
l HE oi ME. Jj 
1 1 Ein 1 
K,- ex iB, K= — Ee 3 
+ Gm) se Tea mal 30i ; 


K, = F(J3(a— 35)0) — Pb — iso, ba = J35/(30,) - 5430; 


Proceeding precisely in the same way, we can calculate $g_4(Z_4)$ and then $p_4^{(r)}$, $p_4^{(w)}$ and so on.


[Fig. 1. The level −log₁₀(0·05) = 1·30 is marked on the ordinate; the abscissa runs from 0 to 0·60.]

3. NUMERICAL ILLUSTRATION

We take ε = 0·05 and choose $\sigma_1$ such that $p_1^{(r)}$ is fairly high.

(i) Let $p_1^{(r)}$ = 0·7995, then $\sigma_1$ = 3·4080. Calculations give

$$p_1^{(w)} = 0·00511,\qquad p_2^{(w)} = 0·00077,\qquad p_3^{(w)} = 0·00010,$$

so that $p_1^{(w)} + p_2^{(w)} + p_3^{(w)}$ = 0·00598. A rough value of the total risk $p^{(w)}$ obtained graphically is 0·006.

(ii) Let $p_1^{(r)}$ = 0·6206, then $\sigma_1$ = 2·7006. Thus

$$p_1^{(w)} = 0·00733,\qquad p_2^{(w)} = 0·00246,\qquad p_3^{(w)} = 0·00063,$$

and thus $p_1^{(w)} + p_2^{(w)} + p_3^{(w)}$ = 0·01042, and graphical evaluation gives $p^{(w)}$ = 0·011 roughly.

(iii) Let $\sigma_1$ be such that $p_1^{(w)}$ is maximum, which implies that $p_1^{(r)}$ = 0·50 and $\sigma_1 = \sqrt{(2a)}$ = 2·4267. In this case

$$p_1^{(w)} = 0·0075,\qquad p_2^{(w)} = 0·0036,\qquad p_3^{(w)} = 0·0011,$$

giving $p_1^{(w)} + p_2^{(w)} + p_3^{(w)}$ = 0·0122, and $p^{(w)}$ obtained graphically is 0·014 roughly.

In each of the three cases $p_2^{(w)}$ and $p_3^{(w)}$ were evaluated by numerical methods. We considered the interval $(d_1, b_1)$ instead of $(-\infty, b_1)$ in the case of $p_2^{(w)}$ and $(d_2, b_2)$ instead of $(-\infty, b_2)$ in the case of $p_3^{(w)}$; $d_1$, $d_2$ were so chosen that the corresponding integrands become negligibly small for values of the argument less than $d_1$ in the former case and for values less than $d_2$ in the latter case.
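The first-step probabilities, and the wrong-decision probability at the second step, can be recomputed along the following lines. The sketch assumes that under $H_0$ each z is normal with mean $\tfrac{1}{2}\sigma_1^2$ and variance $\sigma_1^2$, that $a = \log\{(1-\epsilon)/\epsilon\}$ and $b = -a$, and that $p_2^{(w)} = \int_b^a g(s)\,P(z \le b - s)\,ds$. Simpson's rule is an arbitrary choice of quadrature made here, not necessarily the numerical method used for the figures quoted; the function names are hypothetical.

```python
import math


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def first_step_probs(sigma1, eps):
    """Return (right, wrong) decision probabilities at the first step under H0."""
    a = math.log((1.0 - eps) / eps)
    b = -a
    mean = 0.5 * sigma1 ** 2
    return 1.0 - phi((a - mean) / sigma1), phi((b - mean) / sigma1)


def second_step_wrong(sigma1, eps, steps=2000):
    """P(b < Z1 < a and Z1 + z2 <= b) under H0, by Simpson's rule (steps must be even)."""
    a = math.log((1.0 - eps) / eps)
    b = -a
    mean = 0.5 * sigma1 ** 2

    def integrand(s):
        density = math.exp(-0.5 * ((s - mean) / sigma1) ** 2) / (sigma1 * math.sqrt(2.0 * math.pi))
        return density * phi((b - s - mean) / sigma1)

    h = (a - b) / steps
    total = integrand(b) + integrand(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(b + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(first_step_probs(3.4080, 0.05))
    print(second_step_wrong(3.4080, 0.05))
```

Because the inner integral over the second observation is available in closed form as a normal integral, only a single quadrature over the continuation interval (b, a) is needed for the second step.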




We find that when the probability of the right decision at the first step is 0·7995, 0·6206, 0·50, the total risks of error amount to roughly 0·006, 0·011 and 0·014 respectively, as against the maximum risk ε = 0·05.

The numerical results have been graphically illustrated by drawing the graphs of $-\log p_{(i)}^{(w)}$ against $p_1^{(r)}$ as abscissa (for i = 1, 2, 3, ∞).


My sincere thanks are due to Prof. M. S. Bartlett for suggesting to me this problem. 


REFERENCES

BARTLETT, M. S. (1954). Problèmes de l'analyse spectrale des séries temporelles stationnaires. Publ. Inst. Statist. (Univ. de Paris), 3, fasc. 3, p. 119.
MEDHI, J. (1956). A note on the properties of a test procedure for discrimination between two types of spectra of a stationary stochastic process (to be published).
SAMUELSON, P. A. (1948). Exact distribution of continuous variables in sequential analysis. Econometrica, 16, 2, p. 191.




CORRIGENDA 


(1) GUNNAR BLOM. 'Transformations of the binomial, negative binomial, Poisson and χ² distributions', Biometrika (1954), 41, 302.

I am indebted to Mr A. R. Thatcher for drawing my attention to a computational slip which has led to three errors in Table 1 on p. 310 of my paper which appeared in Biometrika (1954). In the last column of the table, lower confidence limits are given, being roots of the equation in p

$$a - \tfrac{1}{2} - np = \lambda_\alpha\sqrt{(npq)}. \qquad (7\cdot5)$$

The entries for a = 5 should be 0·051, 0·038 and 0·030 instead of 0·002, 0·001 and < 0 respectively. As a consequence, the remark on the same page that (7·5) 'is remarkably good for a = 5' is not justified.

The correction does not affect the recommendation on p. 310 that the inverse sine formula should be used instead of (7·5). It is seen from the corrected Table 1 and has also been confirmed by extended calculations that the inverse sine formula, even without the skewness correction given in formula (7·3), provides more accurate values for confidence levels of 90 % and more. For low confidence levels (7·5) is better, the difference between the two formulae being, however, small. These facts, together with the computational simplicity of the inverse sine formula, lead to the recommendation mentioned above.

In this connexion, it should be mentioned that the extended calculations have proved the tentative statement on p. 310 that formula (7·4) is preferable to (7·5) to be incorrect for small sample sizes. If, for example, n = 20, (7·5) is, in fact, more accurate. In view of the advantages of the inverse sine formula referred to above, this comparison is, however, of little interest.

It might be added that C. A. Bennett & N. L. Franklin, on p. 606 of their book Statistical Analysis in Chemistry and the Chemical Industry (New York: Wiley; London: Chapman & Hall, 1954) give a quadratic expression due to Freeman & Tukey, which also provides confidence limits for a binomial probability. The roots of the quadratic can, with the notation used in my paper, be written


p*g* , 4s _(g*—p* 
giri | En) tape! p*), 


Where Xx 
a= AA Cone 
This formula bears a certain resemblance to the Cornish-Fisher expansion (7·2) on p. … of my paper, but it is much more accurate than (7·2) for small sample sizes, even if the skewness correction of the last-mentioned formula is used. The performance of the Bennett & Franklin formula is about the same as that of the inverse sine formula (without skewness correction), the last formula being, however, in general slightly better. Therefore, it seems unwarranted to use the method described by Bennett & Franklin, even when it is considered to be an advantage to avoid using a table of the inverse sine.
(2) M. B. WILK, Biometrika (1955), 42, 70.


P, 
74, Table 2, second line: A 


i 2 
for ange read i 2 




REVIEWS 


Choice and Chance by Cardpack and Chessboard. Volume II. By LANCELOT HOGBEN. London: Max Parrish and Co. Ltd. 1955. Pp. 466. 70s.

This is the continuation of the first volume which appeared in 1950. The sub-title to this, as to the first, is 'An Introduction to Probability by Visual Aids'


» but this, in the reviewer's opinion, is not ap 
write two text-books about statistical methods, 


more infrequent, and in this second volume Prof. Hog 
else, and begins with a chapter on expectation techniques. This is followed by bivariate models, 
simple analysis of variance, moments, sampling distributions, significance tests, regression, covariance 
and factor analysis, and an awkward chapter on sampling in a finite universe, The book is completed 


by a chapter entitled ‘Second Thoughts on Significance *, which reveals a certain naivety of outlook oP 
the part of the author. 


mes. They will fi 


position which will possibly 
already know. 


Decision Processes. Edited by R. M. THRALL, C. H. COOMBS and R. L. DAVIS. New York: John Wiley and Sons, Inc.; London: Chapman and Hall Ltd. 1954. Pp. viii+332. 40s.


the summer of 1952 in Santa Monica, California. d 
imber of basic questions and perhaps also providing 89 


to say, however, 
will not find th N te 
Tidi leto Pda : e : rse style and self-contained nature of most of the chap 


ere 

to the theory of scales of measurement, we 

9 Social Choice’, ‘Learning Theory’, T the 
} ' Experimental Studies’, One of the most interesting papers ! 

7 B ege : i 

first Section discusses the validit © principles for choosing between alternative cous? 

(1) Wald’s min 


of all possible. 




A good [...] in a highly critical spirit the notion of expected utility in connexion
[...] these papers anyone who believed that rational decisions were those
which [...] utility, in any other than a purely tautological sense, would have his belief
severely [...]. The section devoted to Learning Theory carries on the development of this most
interesting [...] the theory of Conditional Probabilities which is, in the minds of most statisticians,
[...] the name of Frederick Mosteller. The experimental studies represent perhaps the
[...] book, in that the problems considered are extremely trivial. It is remarkable that
[...] made of the very rich sources of experimental material of this kind which exist in
the [...] human activity which we call history, in a broad sense. But this may be being unfair to
[...] it seems that most, if not all, the participants in this seminar recognize that
their [...] processes must as yet be regarded as very remote from application
to real [...]
G. A. BARNARD


Stochastic Models for Learning. By R. R. BUSH and F. MOSTELLER. New York:
John Wiley and Sons, Inc.; London: Chapman and Hall Ltd. 1955. Pp. xvi+365. 72s.


With the usual type of learning experiment a subject has to make one of a number of alternative
responses at each of a sequence of trials. Some of these responses are rewarded while others are punished,
and learning is said to occur if the rewarded responses are made with increasing frequency as the trials
proceed. In this book, Bush and Mosteller have attempted to describe such learning behaviour by
[...] probability models. A suitable model would appear to require that the probability of a
particular response occurring at a given trial is some function of all the preceding responses; further,
that the mean probability for the occurrence of the rewarded responses should increase, while that of
the [...] responses decrease, with successive trials. In order to describe what is essentially a
[...] discrete time series, the authors have constructed their model as a kind of Markov
process, where the state of the system at each trial is defined in terms of a vector, giving the probabilities
for the occurrence of each type of response. Theories of learning are not sufficiently explicit to dictate
the form which such a process should take, and consequently the authors have merely sought to
describe their data in the most economical manner. This is in contrast with the physicist and the
geneticist, who are in the enviable position of being able to construct stochastic processes which are
intimately related to extensive and precisely formulated theories.

The book is divided into two main parts. In the first, a general stochastic process is developed in
terms of a system of linear operators, together with the derivation of a number of its mathematical
properties. In particular, recurrence formulae are given for the moments of the various response
probabilities at the nth trial. The second part deals with the application of special cases of this general
process to [...] obtained from a number of learning experiments performed with animal and human
subjects. It is in this latter half of the book that the authors deal with the problems of estimating the
parameter values and of assessing the goodness of fit of their models. These problems appear to be of
great difficulty and the various attempts which Bush and Mosteller have made to overcome them, while
[...] more of an ad hoc nature, are of value as possibly encouraging further work in a relatively
[...] field of statistical theory. A. R. JONCKHEERE


Introduction to Demography. By MORTIMER SPIEGELMAN. Chicago, Illinois: Society
of Actuaries, 1955. Pp. xxi+309. $6.00.


This book was commissioned by the Society of Actuaries for the use of students preparing themselves
for the Society's examinations. It is thus a parallel work to that of P. R. Cox (Demography, Cambridge
University Press, 1950). While Cox is mainly concerned with the collection and analysis of British vital
statistics, Spiegelman naturally places more emphasis on data arising in the United States and Canada.
There is, for example, a more detailed and comprehensive study of errors occurring in censuses, and
in vital statistics, than is given by Cox, since such errors have persisted longer, and to a greater
extent, in the American censuses than in those of Great Britain.

[...] to the needs of actuarial students in that the items of necessary
[...] On the other hand, there tends to be the
[...] of deep critical discussion.

Subsequent to a discussion of demographic [...] their shortcomings there is a chapter on
measures of mortality, and comparison of mortality in different populations. The author's comparison




of ‘direct’ and ‘indirect’ adjusted rates seems to be less than fair to the latter. A conventional treat- 
ment of life-table construction and mortality projections rounds off the treatment of mortality. 

There follows a brief chapter, too brief in the opinion of the reviewer, on morbidity statistics and a
treatment of family composition and the various indices of fertility and reproduction. The next chapter 
deals with the distribution of the population, and internal and external migration. Internal migration 
is considered rather cursorily and recent British work on the balance of internal migration is not 
mentioned. The following chapter, on ‘The Working Population’, gives an interesting and clear 
analysis of the structure of the working population of the United States, based on published material. 
The final chapter, on population estimates and projections, gives a fair statement of the problems and 
methods of projection, without, perhaps, sufficient cautionary advice on the value of these projections 
when calculated. 

To sum up, this is a good text-book of real value to students, though it breaks little significantly 
fresh ground in the demographic field. N. L. JOHNSON 


Mortality and Other Investigations. Volume I. By H. W. HAYCOCKS and W. PERKS.
Cambridge University Press. 1955. Pp. ix+164. 20s.


This book is one of a series written for the post-war syllabus of the Institute of Actuaries and forms the 
first introduction to the more practical subjects included in the actuarial syllabus. The subjects 
covered are the compilation of mortality and sickness rates, some discussion of the National Life Tables 
and elementary graduation such as the graphic and finite difference methods including, however,
osculatory interpolation. More advanced topics such as selection, continuous exposed to risk formulae
and multiple decrement tables are to be dealt with in a second volume. The bulk of the book is devoted
to the calculation of the ordinary rate of mortality from life office data using the policy year, life year
and calendar year methods. The treatment is both meticulous and exhaustive. The underlying notion
used is that of risk time and the various formulae are built up by combining the appropriate portions of
risk-time. This part of the book seems, however, to suffer from a complete lack of illustrative examples
which would probably drive the points home to the average student much more quickly. There is also
a missed opportunity here in that the straightforward application of such methods to absenteeism,
breakdown of vehicles and so on is nowhere mentioned.
[...] deals mainly with graduation. After the graphic method has been
[...] statistical tests for the observed data are given. For the χ² test the state[...]
[...] χ²-total is, of course, the number of individual values of χ² [...]
[...] rather surprising unless arbitrary mortality rates are being used, [...]
where they have been obtained by a graduation. The authors [...]
they then go on to describe circumstances under which the degrees [...]


labus. 


" re 
, including selection, to be grouped togethe 


Numerical Methods. By ANDREW D. BOOTH. London: Butterworth's Scientific
Publications. 1955. Pp. vii+195. 35s.

The [...] of automatic digital computers has in recent years given remarkable impetus to the
study of numerical analysis. This book provides a short, lucid survey of an extensive and rapidly
growing field. Treated are: interpolation, numerical differentiation and integration, summation of
series, ordinary and partial differential equations, simultaneous linear equations, non-linear algebraic
equations, approximating functions [...]




[...] provides a scanty treatment, with little special mention, for example, of simplified procedures
for symmetric matrices. There is some account of the condition of matrices, but little general discussion
of errors, more than ever important in lengthy machine computations. No exercises are provided.
Numerical analysis has not yet become a standard subject in undergraduate mathematical courses.
It is therefore interesting to note that the present book is based on a course of lectures given by the
[...] final honours students at Birkbeck College. It should be helpful for teaching the basic
[...] those who have a problem to solve and require a fuller treatment of some particular topic
[...] useful key references included in the text. F. G. FOSTER


Numerical Analysis. By Z. KOPAL. London: Chapman and Hall Ltd. 1955.
Pp. xiv+556. 63s.


This book covers many aspects of numerical analysis and in particular such topics as interpolation,
numerical differentiation and quadrature, differential equations and boundary value problems. The
writing is clear and easy to follow yet no rigour is lost. The standard of mathematics assumed
corresponds roughly with pure mathematics in the Advanced Level of the General Certificate of
Education, and new concepts are fully explained as and when they occur. The author does not
hesitate to use numerical examples to illustrate and drive home his various points if he feels that it is
better than a mass of symbols by themselves, and this enhances the value of a book designed to illustrate
numerical methods.

The proof of interpolation formulae utilizes algebraic methods following from the Lagrangian
formulae where n + 1 points of a function are used to fix a polynomial of order n. Alternative
derivations of the main formulae using the quick and neat operational form are given in an appendix.
At the end of each chapter there are some brief bibliographical notes which should enable anyone
[...] further to get a good start. There are also plenty of examples for the
[...], some being straightforward applications of
[...] problems. Answers are not given, which is a pity in
[...]

To complete the book a discussion of the inversion of matrices and [...]
[...] and Everett's interpolation coefficients might have been included. These, though, are minor
criticisms of what is a very thorough and well-written text-book which should be of use to numerous
applied scientists. P. G. MOORE


Lectures on Functions of a Complex Variable. Edited by WILFRED KAPLAN.
Michigan: University of Michigan Press. 1955. Pp. v+433. $10.

[...] lectures on recent de[...]
[...] held for this purpose at the Universi[...]
introductory outlines of different parts of the subject, to
[...] mathematical journals. A considerable proportion of [...]
[...] variety of [...]
[...] is a book for the analyst who specializes in certain parts of his subject,
and the statistical relevance is [...] indirect. Thus the contributions on ‘The Distribution of Zeros of
a Polynomial’, ‘Approximation by Polynomials’, ‘Expansion Theorems for Analytic Functions’, etc.,
may well be of assistance in research into the mathematical functions of statistics but they could
hardly be called statistically [...]
[...] surprisingly large number of the papers which deal (in their applications at least) with
harmonic and other potential functions. For in view of the mathematical
[...] of distributions of mass, charge, etc. [...] of probability, it is remarkable that the
[...] potential theory has not borne [...] statistical fruit.

The book, however, lacks an index and it [...] whether the publication of such a heterogeneous
collection of [...] in quality and [...] serves any useful purpose not already catered for
by [...] The price is prohibitive. D. E. BARTON




OTHER BOOKS RECEIVED 


Introducción a los Métodos de la Estadística. Segunda Parte. By SIXTO RIOS.
Madrid: Instituto de Investigaciones Estadísticas del Consejo Superior de
Investigaciones Científicas. 1954. Pp. 241.


Anthropometry and Human Engineering, from a Symposium held in The
Netherlands in May 1954. London: Butterworth's Scientific Publications (on behalf
of the Advisory Group for Aeronautical Research and Development, N.A.T.O.).
1955. Pp. 123. 21s.


Mathematical Models of Human Behavior. Proc. Symposium Dunlap and
Associates Inc. Stamford, Connecticut: Dunlap and Associates Inc. 1955.
Pp. vii+103.


Nonmetric Factor Analysis. By C. H. COOMBS and R. C. KAO. University of
Michigan Press. 1955. Pp. 63. $2.


Graphical Method of Statistical Inference. By M. MASUYAMA. Tokyo: Maruzen
Co. Ltd. 1953. Pp. $3.


BIOMETRIKA PUBLICATIONS: BOOKS OF TABLES 


Issued by the Cambridge University Press, Bentley House, London, N.W.1 
and obtainable from any bookseller i 


Tables of the Incomplete B-Function 
EDITED BY KARL PEARSON 
59 pages of Introduction and 494 pages of Tables 


Price: 55s. net 


Tables of the Incomplete Γ-Function
EDITED BY KARL PEARSON
31 pages of Introduction and 164 pages of Tables


Price: 42s. net


Tables of the Complete and Incomplete Elliptic Integrals
(from LEGENDRE’s Traité des Fonctions Elliptiques. With autograp
39 pages of Introduction by KARL PEARSON and 94 pages of Tables


Price: 12s. 6d. net


Tables of the Ordinates and Probability Integral of the
Distribution of the Correlation Coefficient in Small Samples
By F. N. DAVID
38 pages of Introduction, 55 pages of Tables, 10 Diagrams and 4 Charts


Price: 17s. 6d. net


NEW PUBLICATION NOW AVAILABLE
Biometrika Tables for Statisticians, Vol. I


[...] volumes of Tables for Statisticians and Biometricians are now out of print a[...]
[...] request of the Biometrika [...] recasting of these Tables [...]
[...] I of the new series, which includes the statistical tables [...]
use is now available. It contains an Introduction and [...]


Price: 25s. net


NEW STATISTICAL TABLES: SEPARATES RE-ISSUED 
FROM BIOMETRIKA 


To be obtained from 


BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON, W.C.1 


I. From Biometrika, Vols. 22, 27 and 28
Tests of Normality. By E. S. PEARSON and R. C. GEARY
Price Two Shillings and Sixpence, post free


II. From Biometrika, Vol. 32, Part 2, pp. 168-181 and 188-189
(1) Table of percentage points of the incomplete beta-function
(2) Table of percentage points of the χ² distribution
Stitched together with introductory matter. Price Two Shillings and Sixpence, post free


III. From Biometrika, Vol. 32, Parts 3 and 4, pp. 300-310
(1) Table of the probability integral of the range in samples from a normal population
(2) Table of the percentage points of the range
(3) Table of the percentage points of the t-distribution
Stitched together with introductory matter. Price Two Shillings and Sixpence, post free


IV. From Biometrika, Vol. 33, Part 1, pp. 73-88
Table of percentage points of the inverted beta (F) distribution
With introductory matter. Price Two Shillings and Sixpence, post free


V. From Biometrika, Vol. 33, Part 3, pp. 252-265
(1) Table of the probability integral of the mean deviation in samples from a normal population
(2) Table of the percentage points of the mean deviation
Stitched together with introductory matter. Price Two Shillings and Sixpence, post free


VI. From Biometrika, Vol. 33, Part 4, pp. 296-304
Table for testing the homogeneity of a set of estimated variances
With introductory matter. Price Two Shillings, post free


VII. From Biometrika, Vol. 35, Parts 1 and 2, pp. 145-156
Table of significance levels for the Fisher-Yates test of significance in 2×2 contingency
tables. By D. J. FINNEY
With introductory matter. Price Two Shillings and Sixpence, post free


VIII. From Biometrika, Vol. 35, Parts 1 and 2, pp. 191-201
Table for the calculation of working probits and weights in probit analysis. By D. J. FINNEY
and W. L. STEVENS
With introductory matter. Price Two Shillings and Sixpence, post free


IX. From Biometrika, Vol. 36, Parts 3 and 4, pp. 267-289
Tables of autoregressive series. By M. G. KENDALL
With introductory matter. Price Two Shillings and Sixpence, post free


X. From Biometrika, Vol. 36, Parts 3 and 4, pp. 431-449
Tables of symmetric functions, Part I. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Two Shillings and Sixpence, post free


NEW STATISTICAL TABLES: continued 


XII. From Biometrika, Vol. 37, pp. 168-172 and pp. 313-325
(1) Table of the probability integral of the t-distribution
(2) Table of the χ² integral and of the cumulative Poisson distribution. By H. O. HARTLEY
Stitched together with introductory matter. Price Five Shillings, post free


XIII. From Biometrika, Vol. 38, Parts 1 and 2, pp. 112-130
Charts of the power function for analysis of variance tests derived from the non-central
F-distribution. By E. S. PEARSON and H. O. HARTLEY
With introductory matter. Price Two Shillings and Sixpence, post free


XIV. From Biometrika, Vol. 38, Parts 3 and 4, pp. 435-462
Tables of symmetric functions. Parts II and III. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Four Shillings, post free


XV. From Biometrika, Vol. 38, Parts 3 and 4, pp. 423-426
Chart for the incomplete beta-function and the cumulative binomial distribution. By H. O.
HARTLEY and E. R. FITCH
With introductory matter and ruler scale. Price Two Shillings and Sixpence, post free


XVI. From Biometrika, Vol. 40, Parts 1 and 2, pp. 70-73
Tables of the angular transformation. By W. L. STEVENS
With introductory matter. Price One Shilling, post free


XVII. From Biometrika, Vol. 40, Parts 1 and 2, pp. 74-86
Tests of significance in a 2×2 contingency table: extension of Finney’s table (No. VII).
Computed by R. LATSCHA
With introductory matter. Price Two Shillings and Sixpence, post free


XVIII. From Biometrika, Vol. 40, Parts 3 and 4, pp. 427-446
Tables of symmetric functions. Part IV. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Four Shillings, post free


XIX. From Biometrika, Vol. 41, Parts 1 and 2, pp. 253-260
Tables of generalized k-statistics. By S. H. ABDEL-ATY
With introductory matter. Price Two Shillings, post free


XX. From Biometrika, Vol. 42, Parts 1 and 2, pp. 223-242
Tables of symmetric functions. Part V. By F. N. DAVID and M. G. KENDALL
With introductory matter. Price Four Shillings, post free


XXI. From Biometrika, Vol. 42, Parts 3 and 4, pp. 494-511
New form of table for significance tests in a 2×2 contingency table. By P. ARMSEN
With introductory matter. Price Two Shillings and Sixpence, post free


No. XI is out of print. 


Biometrika Index


A Biometrika Index comprising Subject Index for Volumes 1-37 and
Author Index for Volumes 1-40 is now available


Price: 6s. net or $1.00


To be obtained from


BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON, W.C.1


BIOMETRIKA PUBLICATIONS 


Issued by the Cambridge University Press, Bentley House, London, N.W.1 
and obtainable from any bookseller 


The Life, Letters and Labours of Francis Galton, Vols. I, II, IIIA, & IIIB
By KARL PEARSON, F.R.S. Price £3. 5s.


Karl Pearson: An Appreciation of Some Aspects of his Life and Work
By E. S. PEARSON Price [...]


A Bibliography of the Statistical and Other Writings of Karl Pearson
Compiled by G. M. MORANT, with the assistance of B. L. WELCH Price 6s.


‘Student's’ Collected Papers Edited by E. S. PEARSON and JOHN WISHART
with a Foreword by LAUNCE McMULLEN Price 15s.


Karl Pearson's Early Statistical Papers


Reprinted by photo-lithography for the Biometrika Trust, with the permission of the original publishers. [...]
Volume contains eleven papers, including the more important of the memoirs entitled “Mathematical
Contributions to the Theory of Evolution”, first published in the Philosophical Transactions of the Royal Society. [...]
original paper deriving the χ²-distribution, published in 1900 in the Philosophical Magazine, is [...]
Price [...] net


APPLIED STATISTICS 


A JOURNAL OF THE ROYAL STATISTICAL SOCIETY 


Editor DONALD G. BEECH Assistant Editor BERNARD T. RAMM 


Contents of Vol. V, No. 3, November 1956 
Statistics in Hospital Planning and Design. Norman T. J. BAILEY 


Monte Carlo Methods and Industrial Problems. W. NEIL JESSOP

The Allocation of Scarce Materials between Products. NOEL WILLIAMS, F.L.A., A.C.W.A.
Industrial Censuses in Great Britain and the United States. RITA J. MAURICE

A Technique for Studying the Effects of a Television Broadcast. WILLIAM A. BELSON


Missing Values in Experiments Analysed on Automatic Computers. MICHAEL HEALY and MICHAEL
WESTMACOTT


A Comment on the Construction of Price Index Numbers. K. S. BANERJEE 
NOTES AND COMMENTS 


MEETINGS OF SECTIONS OF THE ROYAL STATISTICAL SOCIETY 
Book REVIEWS 


The price of this part is 10s. 


The journal is published three times each year. From the beginning of 1957 the annual subscription will be
30s. or $5.00 post free. Orders should be sent to


OLIVER AND BOYD: Tweeddale Court, Edinburgh, 1 




BIOMETRICS 


Journal of the Biometric Society 


VOL. 12, No. 3 TABLE OF CONTENTS SEPTEMBER, 1956
Extensions to missing plot techniques. H. R. THOMPSON—The analysis of a 3 × 6 experiment arranged in
a quasi-Latin square. G. E. HODNETT—A note on the 4ⁿ series of factorial experiments. P. J. CLARINGBOLD—
Some small sample tests of significance for a Poisson distribution. C. RADHAKRISHNA RAO and
I. M. CHAKRAVARTI—Matched pairs in sequential trials for significance of a difference between proportions.
W. Z. BILLEWICZ—A note on the rank analysis of incomplete block designs: Applications beyond
the scope of existing tables. OTTO DYKSTRA—Extension of multiple range tests to group means with
unequal numbers of replications. C. Y. KRAMER—Simplified LD50 (or ED50) calculations. HENRY J.
HORN—A simple method for fitting an asymptotic regression curve. H. D. PATTERSON—Campinas
Symposium paper: Some problems of experimental design and technique with perennial crops. S. C. [...]

VOL. 12, No. 4 DECEMBER, 1956


Theoretical and experimental study of self fertilized populations. T. W. HORNER and C. R. WEBER—
A rankit analysis of paired comparisons for measuring the effect of sprays on flavor. C. I. BLISS, M. L.
GREENWOOD and E. S. WHITE—The discrimination of interactions and linkage in continuous variation.
BIRGER OPSAHL—The statistical analysis of a complex experiment involving unintentional restraints.
D. J. FINNEY and F. W. COPE—Simplified analysis of singly linked blocks. K. R. NAIR—Applications of
the k4 statistic to genetic variance component analyses. D. S. ROBSON—Note on Wald's method of fitting
a straight line when both variables are subject to error. E. S. KEEPING—Campinas Symposium papers:
Recent advances in biometry in Japan. M. MASUYAMA and M. HATAMURA—Control of errors in surveys.
M. H. HANSEN and J. STEINBERG—The study of physiological effects of hot climates. J. O. IRWIN—
Confidence limits for measuring the precision of bioassays. C. I. BLISS.


For American Statistical Association Members, 
iation or the Biometric Society, 


eal subscription rates to non-members are as follows: Ar 
$ mg for subscribers, non-members of either American Statistical Assoc 
00. “Subscriptions should be sent to the 

MANAGING EDITOR, BIOMETRICS 


NATIONAL RESEARCH COUNCIL, OTTAWA 2, CANADA 


TRABAJOS DE ESTADISTICA 


REVIEW PUBLISHED BY INSTITUTO DE INVESTIGACIONES ESTADÍSTICAS 
OR DE INVESTIGACIONES CIENTÍFICAS 


OF THE CONSEJO SUPERI 
MADRID, SPAIN 


Vol. vry CONTENTS Cuad. 1 
; SOM "m uencial. - 
à din Rost no paramita Lear adio de la evolución y relación de medidas antropometricas 


* JASO, J, Besar, A. ARBELO Y S. Riıos— Estu 


en lon E 
Noras S niños menores de un ano. 


Programación lineal. 


Teoría Económica. Aplicaciones de la 


+ CASTAR 
TAÑEDA— L: ión lineal y la 
—La programación 1 
ONICAS, BIBLIOGRAFIA. CUESTIONES Y EJERCICIOS. 
Cuad. II 


ramación lineal. . 
E les de Poisson. 


à iab 
I algebraica de varia. sson. | . 

3 ee istributi i i ir role in the theory of Wilner’s stochastic variables. 

A B KO— Perks’ distributions and the "died n 

Noras ENNETT— Note on the Poisson Index o 


* Ríos. ws . tigación operativa. 
3 eee y problemas de Mt investigación acon lineal. 
Q SAN Juan—EI método del ‘simplex’ en J2 PEE 


Cro NALZ— Inspección de materiales fabricados. 


ONIC EJERCICIOS. 

A. B ones Y E! 
I CUESTI P , " 
BLIOGRAFIA- write to Professor Sixto Rios, Instituto de 


Or g s and subscription WILS = Ans ( 
qe tis thing in connection with works, exchange and Nestigaciones Cientificas (Serrano, VER sone Spain. 
ide cop ones Estadísticas of the Consejo “ iplished three times a Year bout 2 pages), 

Pesetas is composed of three fascicles D, 00 U.S.A- for all other countries. 
às for Spain and South America and $4. 


Annals of Human Genetics 
Formerly ANNALS OF EUGENICS 
Edited by L. S. PENROSE 


Fol 21,.PL. 1 CONTENTS August 1956 


3 à j f 

i oglobin differences: problems and perspectives. J. V. NeeL—The relation of 
Lee ee eae colour. N. A. Barnicot—The inheritance of muscular dystopi 
further observations. J. N. WALTON—Data on linkage in man: hereditary deafness in Northern Tre! ane. 
E. A. CHEESEMAN, CAROL M. E. CROZIER and J. D. MERRETT—The sickle cell and haemoglobin C gens 
in some African populations. A. C. ALLISON—On the stability of allelic systems, with special reference o 
haemoglobin 4, S and C. L. S. PENROSE, SHEILA M. SmiTH and D. A. SPRorr— Data on linkage in man: 
P. T. C. tasting and some dermatoglyphic traits. J. PoNs—Reviews. 


Vol. 21, Pi. 2 December 1956 


Sex-linked ocular albinism in a Dutch family. J. VAN DEN BOSCH and P. J. WAARDENBURG— The Ti 
ship between parental age, birth order and the secondary sex ratio in humans. E. Novirskt an 2 
SANDLER—The retinae of monovular and binovular twins. R. PLATT and R. Lawton—Blood groups ba 
Persian Jews. J. GUREVITCH, E. Hasson and E. MancoLis—Heredity and rheumatic fever. A. C. STEVE a 
SON and E. A. CHEESEMAN—Effect of migration on some genetical characters in six endogamous groupa 
in India. SATYAVATI M. SiRsAT—Colour-blindness and the Duchenne type muscular dystrophy. CE 
Pare and J. N. WALTON (with a note on the estimation of linkage. C. A. B. SmarH)—Nail-pate e 
syndrome: evidence for modification by alleles at the main locus. J. H. Renwick—Hair colour in t 
infantile Fanconi syndrome. VALERIE CowrE—Sib correlations of weight and rate of growth and relatio 
between weight and growth rate over the first five years. Mary N. KARN—REVIEW. 


Subscription price 80s. net per volume of four quarterly parts (in U.S.A. $13.50) post free. 
Single issues, 25s. (in U.S.A. $4.50) postage extra. 


CAMBRIDGE UNIVERSITY PRESS 
BENTLEY HOUSE, 200 EUSTON ROAD, LONDON, N.W. 1 


The Annals of Mathematical Statistics 


The Official Fournal of the Institute of Mathematical Statistics 
VOL. 27, NO. 4 CONTENTS DECEMBER, 1956 


Subscription rate $12.00 per year in the United States and Canada and $10.00 per year elsewhere 


ADDRESS ORDERS FOR SUBSCRIPTIONS AND BACK NUMBERS TO 
PROFESSOR GEORGE E, NICH 


OLSON, JR., SECRETARY, INSTITUTE OF 
MATHEMATICAL STATISTICS, DEPARTMENT OF STATISTICS, 
UNIVERSITY OF NORTH CAROLINA, CHAPEL HILL, NOR TH CAROLINA 




St 
atisti 
Stical Publishing Society, 


ECONOMETRICA 


JOURNAL OF THE ECONOMETRIC SOCIETY 


Contents of Vol. 24, No. 4, October 1956, include: 


R. Resource Allocation for Economic Development 
Aggregation Problem of Input-Output Analysis 
f Linear Programming to Competitive Bond Bidding 


item B. CusNERY and KENNETH S. KRETSCHME 
sake NE A Fundamental Theorem for the 
PI. Ve . PERCUS and LEON Quinto. Application o ear ] 
Eee Complementarity and Long-Range Projections 
Conta — An Eclectic Approach to the Pure Theory of Consumer Behavior 
LD. SA . BLYTH. The Theory of Capital and Its Time Measures 
n -— A Note on Mr Blyth's Article er 
E. esi McManus. On Hatanaka’s Note on Consolidation 
Bene R. ER. On the Stability of Certain Economic Systems 

EVIEWS, NOTES AND ANNOUNCEMENTS 


Published Quarterly 


The . z 
iei conometrie Society is an internation: 
nai to statistics and mathematics. 
ina coe to Econometrica and inquiries about the work of the 
pplying for membership should be addressed to 


RICHARD RUGGLES, Secretary 


THE ECONOMETRIC SOCIETY, BOX 1264, YALE UNIVERSITY 
NEW HAVEN, CONNECTICUT, U.S.A. 


Subscription rate available on request 


al society for the advancement of economic theory in its 


Society and the procedure 


SANKHYA 


OURNAL OF STATISTICS 
HALANOBIS 


THE INDIAN J 
EDITED BY p. C. MA 


Vol 
* 17, part 2, 1956 CONTENTS 


cin Recovery of Inter-block Information in Varietal Tri 

n the Bia and Analysis of Linked Block Designs. BY J. 

"actio, ual of a PBIB Design and a New Class of Designs with à 
Ou Replication in Asymmetrical Factorial Designs an 
EN RAVARTI 

Two i Class of Quasifactorial and Ri 

me ^ poate Partially Balanced Designs Involving Three Re is 
The c, eries of Balanced Incomplete Block Designs. By D. A SPRO 
A y cept of Asymptotic Efficiency- By D. BASU 


als. By C. RADHAKRISHNA RAO 


Roy and R. G. LAHA 
Two Replications. By C. S. RAMAKRISHNA! 
Partially Balanced Arrays. By I. M 


elated Designs. By C. RADHAKRISHNA Rao 
plications. By J. Roy and R, G. LAHA 


thout Replacement. By Drs R. 


Ote on the Determination of Optimum Probabilities in Sampling wi 
SUBSCRI CURRENT BACK NUMBERS 
PTIO UR . 
i volume per issue per volame ET issue 
MA Pg. 30 LR Rs 00 P) 
FongiGN $10.00 $3.50 $15. 


Subscriptions and orders for back numbers should be sent to 


204/1 Barrackpore Trunk Road, Calcutta - 


ROYAL STATISTICAL SOCIETY 


THE JOURNAL OF THE ROYAL STATISTICAL SOCIETY is published in two series: SERIES A
(GENERAL), four issues a year, 15s. each part, annual subscription £3. 1s. post free; SERIES B
(METHODOLOGICAL), two issues a year, 22s. 6d. each part, annual subscription 45s. 6d. post
free.


SERIES A (GENERAL), VOL. 119, PART 3, 1956 


United Kingdom Indices of Wholesale Prices, 1949-1955. By H. S. PHILLIPS. (With Discussion.)
The Television Audience in the United Kingdom. By B. P. EMMETT. (With Discussion.)


Some Theory of Index Numbers. By YOU POH SENG.
A Geometrical Derivation of the Analyses of Covariance and Variance. By C. P. COX.
Note on the History of Sampling Methods in Russia. By S. S. ZARKOVIC.


REVIEWS OF BOOKS, STATISTICAL NOTES, ADDITIONS TO LIBRARY, PERIODICAL RETURNS.


SERIES B (METHODOLOGICAL), VOL. 18, No. 1, 1956 


Some Tests of Significance with Ordered Variables. By F. N. Davip and N. L. JOHNSON. (With Discussion.) 


Economic Choice of the Amount of Ex 


ES. 
perimentation. By P. M. GRUNDY, M. J. R. HEALY and D. H. RE 
(With Discussion.) 


On a Test of Significance in Pearson's Biometrika Table (No. 11). By Sir RoNALD FISHER. 


A Test of Significance for an Unidentifiable Relation. By P. A. P. MORAN. 


Confidence Limits for the Gradient in the Linear Functional Relationship. By Monica A. CREASY. 


Algebraic Theory of the Computin 


. Eam " E " rmal 
£ Routine for Tests of Significance on the Dimensionality of No 
Multivariate Systems. 


By F. E. BiNET and G. S. WATSON. 
Some Notes on Ordered Random Intervals, By D. E. BARTON and F. N. DAVID. 
A Sequential Test for Randomness. By D. J. BARTHOLOMEW., 


Partitions in More Than One Dimension. By J. H. BENNETT. 


On the Estimation of Small Frequencies in Contingency Tables. By I. J. GooD. 


An Elementary Method of Solution of the 


" ara- 
Queueing Problem with a Single Server and Constant P: 
meters, 


By D. G. CHAMPERNOWNE, 


Random Queueing Processes with Phase-type Service. By R. R. P. JACKSON. 


The Within-Animal Bioassay with Quantal Responses, By P. J. CLARINGBOLD. 


ROYAL STATISTICAL SOCIETY, 21 BENTINCK STREET, LONDON, w.1 


(viii) 


VOLUME 43, PARTS 3 AND 4    DECEMBER 1956


STUDIES IN THE HISTORY OF PROBABILITY AND STATISTICS
III. A NOTE ON THE HISTORY OF THE GRAPHICAL PRESENTATION OF DATA


By ERICA ROYSTON

Division of Research Techniques, London School of Economics and Political Science


1. The picture of the business man despondently watching his production curve disappear off the chart on the wall and down through the floor into the office below appears to be a joke that has worn slightly thin. Nevertheless, the very fact that graphical presentation can become the subject of a cartoon shows how completely it has come into everyday usage. The graph is now generally accepted as one of the clearest and most [...] ways of presenting data, whether for consumption by the layman or the specialist. Many volumes have been published on the art of graphical presentation, on the appropriate method to be used for any specific purpose and on the advantages and disadvantages of various approaches to the subject. It seems strange therefore that so little curiosity has been displayed regarding the historical origins of the graphical representation of data.* In fact, these historical origins are not at all clear.


2. As in other historical studies of this nature it is virtually impossible to state that a given method was introduced by any one person or at any one moment in time. The most that one may hope to do is to show that such a method was being used at a certain time and to investigate the author's claims, if any, to originality. Apart from this, some of the underlying concepts can be traced back to their possible origins.


3. For the purpose of this note the graphical representation of data will be taken to mean the geometric and graphical presentation of factual data as distinct from the graphical plotting of mathematical functions. Although the development of the technique of using Cartesian co-ordinates with one axis representing time seems to date back only a few centuries, the spatial representation of time originated much earlier. There exist instances of the motion of stars, or, more precisely, the inclinations of the planetary orbits being plotted as a function of time, as early as the tenth century (Funkhouser, 1936), and the emergence of written music represents one of the earliest instances of the use of time-series.†


* Karl Pearson, while holding the part-time appointment of Professor of Geometry at Gresham College (1890-4), gave a series of twelve lectures under the heading 'The Geometry of Statistics'. The syllabus which survives refers to Playfair as the father of the subject but the lectures themselves have been lost. See E. S. Pearson (1938, p. 142).

† Neumes (νεύματα), the basis of present-day musical notes, were at first only approximately defined in terms of time. The duration of notes was not fixed precisely until the thirteenth century, with the so-called Franconian reform (after Franco of Cologne's Ars Cantus Mensurabilis). Together with the gradual introduction of the bar and a fixed beat in the bar, music became what may be termed a true time-series, where notes were graphically presented, using the tone as ordinate against the time as abscissa.


4. The basic idea of using co-ordinates to determine the location of a point in space dates back to the Greeks at least, although it was not until the time of Descartes that mathematicians systematically developed the idea. For the purpose of tracing the origins of graphical representation the emergence of the concept of plotting mathematical functions of the type y = f(x) is important. The most usual form of time-series is, in fact, no more than this, except that usually some form of frequency is plotted as an observed, not necessarily strictly mathematical, function of time. Cartesian co-ordinates are therefore the foundations on which the modern graph of data is based. It is interesting to note in passing that William Playfair (1801) justifies plotting money against time as follows:
This method has struck several persons as being fallacious because geometrical measurement has not any relation to money or to time yet here it is made to represent both. The most familiar and simple answer to this objection is that if the money received by a single man in trade were all in guineas, and every evening he made a single pile of all the guineas received during the day, its height would be proportioned to the receipts of that day, so that by this plain operation time, proportion and amount would be physically combined.


As will be shown, however, this picturesque description is merely an explanation to the layman, for Playfair himself almost certainly approached the subject as an extension of the use of rectangular co-ordinates, which by his time were familiar in mathematics.


5. One of the early users of graphical representation in statistics was A. F. W. Crome. He was born in Germany in 1753, the third of twenty children. His father was a clergyman and, following in his footsteps, Crome studied theology. During his studies he acted as tutor to the children of General von Holzendorf and later to those of Karl Alexander von Bismarck. He passed his examinations in 1771 and in 1783 became lecturer in geography and history at Dessau. In 1783 he became a tutor to the 16-year-old Prince of Dessau; and finally in 1786 he was made ordinary Professor of Statistics and Public Finance at Giessen University, where he remained till he retired in 1831. He died two years later.


6. With an academic background such as this it is hardly surprising to learn that Crome first evolved his system of geometric representation as an aid to teaching. Crome was, in modern terminology, an economic geographer rather than a statistician. His work was descriptive rather than analytical. Although the Allgemeine Deutsche Biografie calls him a 'pioneer in statistics', it is probably using the word in its original meaning, i.e. as referring to states. Thus the adjective 'statistical', which appears in nearly all the titles of his works, e.g. Geographisch-statistische Darstellung der Staatskräfte, merely indicates that some numerical data are given. Thus, for example, in Über die Grösse und Bevölkerung der sämtlichen Europäischen Staaten (1785), Crome gives a detailed description of the geographical 'vital data' of most European states, chiefly population figures and areas. To make the importance of such data clearer he devised his Grössen Karte, which was of much the same type as that illustrated in Fig. 1. (The latter is taken from a similar treatise dealing with the German states.)


He justifies the use of this geometrical representation as follows (1785):

If therefore one does not [...] knowledge of geography to knowing the names of the cities, provinces, [...] might of the various [...] of the area [...]. Different sizes can however be more easily seen and grasped if they are brought before the eye in the form of a drawing, because the imagination is thus stimulated, than if [...] numbers, especially when these consist of many digits as is often the case with areas of states....


7. The method Crome employed in these charts was simple. He expressed the area of each of the states with which he was dealing as a square proportional to that area. He placed these squares inside each other in such a manner as to keep the vertical sides of the squares, which are of course proportional to the square root of the area represented, in [...]. Where the area of a state was relatively much larger than the others he broke the [...] in the same manner as is done on modern graphs.
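
The relation between a state's area and the side of its square can be illustrated with a short sketch. This is only an illustration of the proportionality described above, not Crome's own procedure: the function name `square_sides`, the `scale` parameter and the three areas are assumptions chosen for the example, not figures from his charts.

```python
import math


def square_sides(areas, scale=1.0):
    """Side length of a square whose area is proportional to each value.

    `areas` maps a (hypothetical) state name to its area; `scale` converts
    the square root of the area into drawing units.
    """
    return {name: scale * math.sqrt(area) for name, area in areas.items()}


# Hypothetical areas in arbitrary units, NOT taken from Crome's data.
areas = {"State A": 100.0, "State B": 400.0, "State C": 2500.0}
sides = square_sides(areas)

for name, side in sides.items():
    print(f"{name}: area {areas[name]:.0f}, side {side:.1f}")
```

A state with four times the area of another is thus drawn with a side only twice as long, which is why sides cannot be read as if they were the areas themselves.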


8. The idea of representing data in this way seems to have been thought completely new, for Crome went into great detail in describing the uses and misuses of his charts. For example, he warns his readers that there is no justification for assuming that, because the squares were drawn one inside the other, the outside square, or rather the country represented by it, was larger than all those within it added together. It appears that Crome was influenced by the comparison of geographical areas on maps; in fact, he used spatial magnitudes, not the linear magnitudes moving through time of the Cartesian approach.

There is, however, one example of graphical representation being used by one of Crome's contemporaries and compatriots, namely, G. R. von Gothe's Höhen Tableau (Altitude Table) (1813). This is an imaginary panoramic scene, complete with trees and animals, where most of the highest known mountains of Europe and America appear in proportion, their altitudes being given by reference to a scale on the vertical axis.


9. In 1817 Crome wrote:

I have used this method with great success in my 35 year career as Docent. To this end I published my first Produkten Karte of Europe, which was followed in 1785 by the Grössen Karte. This was issued in an improved form in 1792 as the Verhältniss Karte of Europe.

It seems that the Produkten Karte referred to differed little from the later ones,* and it is an example of the Verhältniss Karte that is illustrated here as Fig. 1. It appears as part of a larger chart which also gives lists of data referring to military strength, etc.

* This chart is missing in the copies of Crome's works available both in the British Museum and the British Library of Political Science.


10. At the base of this chart, published in 1820, is a diagram that Crome seems to have used for the first time. An extract of it is shown as Fig. 2. It consists basically of a series of circles each representing the density of population in the state to which it refers, the density being inversely proportionate to the area of the circle. The various half-tangents and extended radii denote total population and national income, the lines being drawn in different colours to distinguish them from one another. Thus total population is denoted as follows:

The tangent on the right-hand side of several circles indicates the population in millions as measured on the scale, the radii leaning to the right indicate 100 thousands, and the thin black lines also leaning to the right thousands. (N.B. The length of these slanting lines is immaterial, as the quantity is indicated by the number of sections of the vertical scale crossed.) These quantities are then added together to give the total population. In fact, each of these lines represents a group of digits from the original figure.

Thus, for example, Luxemburg (third from left) has a population density of 1981 per German square mile (from the text below the circle). Its population is indicated by the extended radius sloping to the right which shows 2.5 units, the unit in this case being 100,000 (see above). Total population is thus 250,000. The national income is shown in the same way: the tangent on the left indicates 1 unit, i.e. 1 million Gulden, and the sloping radius indicates 5 units, i.e. 5 × 100,000 Gulden. Total national income is thus 1,000,000 + 500,000 = 1,500,000 Gulden. National income per head, as indicated by the radius extended downwards, is 6 Gulden.
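
The Luxemburg reading above amounts to summing a few (units, unit value) pairs, one pair per line on the diagram. The sketch below merely restates that arithmetic; the helper name `decode_quantity` is an assumption introduced for illustration and is not terminology used by Crome.

```python
def decode_quantity(groups):
    """Add up (units read off the scale, value of one unit) pairs."""
    return sum(units * unit_value for units, unit_value in groups)


# Luxemburg, as read from Fig. 2 in the text above.
population = decode_quantity([(2.5, 100_000)])             # radius sloping right
income = decode_quantity([(1, 1_000_000), (5, 100_000)])   # tangent plus radius
income_per_head = income / population

print(population, income, income_per_head)
```

The quotient agrees with the 6 Gulden per head that the text reads from the radius extended downwards, so the three quantities on the diagram are mutually consistent.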


11. Although a little involved and misleading and on the whole giving only a very [...] overall picture, this method is perhaps more precise than if the population and national income figures had all been plotted to scale. Basically this could be taken as a crude attempt at a bar chart, the circles being merely a generalization of the squares Crome used earlier.


[...] the mob in 1791. In the next year [...] Company, coupled with some [...] Playfair to leave Paris for Frankfurt.

[...] were to be erected across the country to form a chain whereby messages could be relayed [...] and sent them to the Duke of York.

[...] England. In actual fact, R. L. Edgeworth [...] of a race at Newmarket to his home [...] opened a 'Security Bank' to facilitate [...] soon collapsed, and from 1795 Playfair [...]

There is nothing of the studious or academic background of Crome here, and it is perhaps rather surprising that what Playfair wrote was good sound economics and that his works were illustrated with some extremely good graphs, histograms and diagrams. He wrote mainly works on general descriptive economics, one of his main subjects being the international balance of trade. Most of the graphs he used were time-series, chiefly export and import figures expressed in millions of pounds; but he also uses circles to present spatial magnitudes after the manner of Crome. His earliest work containing such graphs seems to be his Commercial and Political Atlas, published in 1786. This consisted of graphs of exports and imports represented by coloured lines, the gap between them being termed the balance in favour of England and usually also coloured. He made such graphs for trade to and from most of Britain's foreign markets. In the appendix there are also some very good bar diagrams showing the amount of exports to and imports from each country plotted against time.

Playfair's professed aim in publishing his charts was to make statistics, presumably in the sense of data relating to state-craft, a little more palatable. Statistics, one may gather from the following extract from Playfair's Statistical Breviary (1801), were not thought of very highly.

No study is less alluring or more dry and tedious than statistics, unless the mind and imagination are set to work or that the person studying is particularly interested in the subject; which is seldom the case with young men in any rank in life.

Most of Playfair's later diagrams were merely improvements of his earlier work. Fig. 4 shows one such graph where two series have been superimposed upon one another, one as a graph and the other as a histogram (the two series here being weekly wages and the price of wheat (1821)). He illustrated his British Family Antiquity with several beautifully executed bar diagrams, which differed from the true frequency graphs used elsewhere by the fact that only the presence or absence of a given factor is plotted against time. This type of diagram, Playfair conceded, had long been used in chronology.

Fig. 3 shows an extract from an earlier example of Playfair's work. Here the areas of European states are expressed as circles and in some cases subdivided as pie diagrams. The line on the left-hand side of the circles indicates population in millions and that on the right-hand side national revenue in millions of pounds. The slope of the line joining these two is intended to indicate the approximate ratio of one to the other. Playfair nowhere states this and is presumably aware of the fact that the differing diameters of the circles make any exact comparison impossible. This particular figure has been mentioned earlier as closely resembling Crome's work (Fig. 2) and therefore warranting consideration of the possibility of there being some connexion between the two writers. Certainly this was possible, for Crome and Playfair were contemporaries and, moreover, both wrote in the same field. Playfair's diagram was, however, published before Crome's, and it seems unlikely that, having had sufficient flair to produce perfect time-series graphs, he should have copied Crome. Conversely, however, it seems equally unlikely, even were it possible, that Crome should have taken a basically clear and simple diagram of Playfair's and made such a complicated diagram as Fig. 2 out of it, unless, of course, he was working under the impression that he was improving on Playfair's work. On the face of it, with no direct evidence either way, it seems that if there was any connexion between the two it was Crome who was borrowing from Playfair and not vice versa. But it is at least equally possible that the two worked entirely independently.


19. There is nothing crude or clumsy about Playfair's work and it seems surprising that one man should have developed it to such a high pitch; for, unlike Crome, Playfair makes definite claims to originality, calling himself the 'inventor of linear arithmetic' as he termed graphical representation. In 1796, for instance, he writes:


I confess I was long anxious to find out whether I was actually the first who applied the principles of geometry to matters of finance as it has long been applied to chronology with great success. I am now satisfied, upon due enquiry, that I was the first, for during the 11 years I have never been able to learn that anything of a similar nature had ever before been thought of...


and later (1805): 


The impression is not only simple, but it is as lasting in retaining as it is easy in receiving. Such 
are the advantages claimed for the invention 20 years ago, when it first appeared. The claim has been 
allowed and not objected to so far as the inventor knows, either in this or in any other country. 


And as an explanation of how he came to think of such graphical representation (1805):


I think it well to embrace this opportunity, the best I have had, and perhaps the last I ever shall have, of making some return (as far as acknowledgement is a return) for an obligation, of a nature never to be repaid by acknowledging publicly, that to the best and most affectionate of brothers I owe the invention of these charts.

At a very early period of my life, my brother, who in a most exemplary manner maintained and educated the family his father left, made me keep a register of a thermometer expressing the variation by lines on a divided scale. He taught me to know that whatever can be expressed in numbers may be represented by lines. [...] thermometer was on the same principle with those given [...] whom I owe this now fills the Natural Philosophy [...]


[...] even the Academy of Sciences 'testified [...] accounts' (Playfair, 1798).


[...] Therefore, it seems quite possible that William Playfair, if not the [...] representation as we know it to-day, was the first to introduce [...] Crome independently had much the same ideas so far as [...]


[Biometrika, Vol. 43, Parts 3 and 4. Royston: Studies in the History of Probability and Statistics. III.]

[Plate 1 (facing p. 246). Figs. 1 and 2. Extracts from Crome's Geographisch-statistische Darstellung, 1820.]

[Plate 2. Fig. 3. Extract from Playfair's [...]. Fig. 4. Extract from Playfair's A Letter on our Agricultural Distress, 1821.]


[...] graphical presentation of statistics should have had to wait until the end of the eighteenth century. The explanation may well be that until the middle of that century there was very little to present. To quote Playfair himself (1801):


Statistical knowledge, though in some degree searched after in the most early ages of the world, has not till within these last 50 years become a regular object of study.


REFERENCES

Crome, A. F. W. (1785). Über die Grösse und Bevölkerung der sämtlichen Europäischen Staaten. Leipzig.
Crome, A. F. W. (1820). Geographisch-statistische Darstellung der Staatskräfte. Leipzig.
Edgeworth, R. L. (1797). A Letter to the Rt. Hon. Earl of Charlemont on the Tellograph. Dublin.
Funkhouser, H. G. (1936). A note on a 10th century graph. Osiris, 1. Bruges.
Pearson, E. S. (1938). Karl Pearson. Cambridge University Press.
Playfair, W. (1786). The Commercial and Political Atlas. London.
Playfair, W. (1796). For the Use of the Enemies of England. London.
Playfair, W. (1798). Linear Arithmetic. London.
Playfair, W. (1801). Statistical Breviary. London.
Playfair, W. (1805). An Inquiry into the Permanent Causes of the Decline and Fall of Powerful and Wealthy Nations. London.
Playfair, W. British Family Antiquity. London.
Playfair, W. (1821). A Letter on our Agricultural Distress. London.
von Gothe, G. R. (1813). Höhen Tableau. Allgemeine Geographische Ephemeriden, 41. Weimar.




STUDIES IN THE HISTORY OF PROBABILITY AND STATISTICS
IV. A NOTE ON AN EARLY STATISTICAL STUDY OF LITERARY STYLE


By C. B. WILLIAMS, Sc.D., F.R.S.


In Biometrika for January 1939, G. Udny Yule discussed the frequency distribution of sentence length in samples of the writings of different authors. After showing that each author had a fairly characteristic distribution, he turned to the value of the method in cases of uncertain or disputed authorship. Thus, in the case of De Imitatione Christi he showed that the frequency distribution of sentences with different numbers of words more closely resembled that of works by Thomas à Kempis than that of works by Gerson.

In Biometrika for March 1940 I showed that the skew distribution found by Yule could be brought almost to a symmetrical form by using a geometric or logarithmic scale for the number of words per sentence, thereby simplifying the mathematical comparison. In this note I mentioned that some years previously (about 1935) I had made a study of frequency distributions from different authors using the number of letters per word as the variable, but that I had not found any striking differences. I considered Yule's use of the number of words per sentence as a better technique, giving a greater range of possible variation and comparison.

In a letter written to me in June 1939 Yule said: 'I booked up some ten years ago a number of distributions of word-length by the number of syllables only. Monosyllables are of course considerably in the majority (if I remember rightly I omitted "a" and "the"), and different authors diverged a good deal, but, so far as I can recall, the range from Bunyan to [...]

[...] suggesting that similar distributions of the numbers of syllables per word, or the number of words per sentence, might well help to throw light on cases of doubtful or disputed authorship.


Through the kindness of Mr Rushworth Fogg of Glasgow I was put on the track of a paper published in 1901 by Thomas Corwin Mendenhall,* in which he gives a reference to a still earlier paper published in 1887, both of which I have been able to examine.


Mendenhall states in his first paper (1887) that five or six [...] was better, as while the average number of letters per word is easily obtainable from his data, [...] possibilities of comparison.


* See biographical note on p. 255 below.


Augustus de Morgan was Professor of Mathematics at University College, London. His Budget of Paradoxes was first printed as weekly notes in the Athenaeum and republished in book form in 1872, after the author's death. I have examined the second edition (1915), but although it contains a few references to cases of disputed authorship, I cannot find any mention of the use of the average number of letters per word. It may be in one of his other works, or perhaps in one of his Athenaeum notes that was not reprinted in book form.

Mendenhall, who was primarily a physicist, was attracted to the frequency distribution technique by its resemblance to spectroscopic analysis, which was at that time much to the fore in scientific circles. He writes: 'It is proposed to analyse a composition by forming what may be called a "word spectrum" or "characteristic curve" which shall be a graphic representation of the arrangement of words according to their length and to the relative frequency of their occurrence.' The mathematics of the comparison of frequency distributions was very little understood at the time when he was writing.
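
A 'word spectrum' in Mendenhall's sense is simply the frequency of each word length, here expressed per 1000 words as on the axes of the figures in this note. The sketch below is a modern illustration under stated assumptions, not Mendenhall's counting procedure: the function names are hypothetical, a word is taken to be any unbroken run of the letters A to Z (so apostrophes and hyphens split words), and the sample sentence is far too short to give a meaningful curve.

```python
import re
from collections import Counter


def word_lengths(text):
    """Lengths of all words, a word being a run of ASCII letters (assumption)."""
    return [len(word) for word in re.findall(r"[A-Za-z]+", text)]


def word_spectrum(text):
    """Number of words of each length per 1000 words of text."""
    lengths = word_lengths(text)
    counts = Counter(lengths)
    return {n: 1000 * counts[n] / len(lengths) for n in sorted(counts)}


def mean_word_length(text):
    """Average number of letters per word."""
    lengths = word_lengths(text)
    return sum(lengths) / len(lengths)


sample = "It is proposed to analyse a composition by forming a word spectrum"
spectrum = word_spectrum(sample)
for n, per_thousand in spectrum.items():
    print(f"{n:2d} letters: {per_thousand:6.1f} per 1000 words")
print(f"mean word length: {mean_word_length(sample):.3f}")
```

Two spectra computed in this way can be laid over one another, which is the kind of comparison of curves that Mendenhall made by eye.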
Mendenhall first discusses samples taken from different books by the same author to see if they resemble each other sufficiently closely to make comparisons between one author and another likely to be profitable. Most of the evidence is given in the form of thirteen graphs and, unfortunately, in only very few cases are actual numbers presented.


The first seven graphs deal with various combinations of ten samples, each of one thousand words, from Dickens's Oliver Twist and Thackeray's Vanity Fair. In his second figure the distributions of five separate samples of 1000 words from Oliver Twist are shown superimposed, and there is no doubt as to their general resemblance. In another (Fig. 1 in this paper) he shows one graph for the whole 10,000 words from Oliver Twist and another for Vanity Fair. There is very little difference in the average length of the words (Dickens, 4.324; Thackeray, 4.481), but Vanity Fair has rather more words of 3 and of [...] letters, while Oliver Twist has more of 1, 2 and 4-6 letters. Mendenhall was somewhat disappointed by the lack of difference and commented 'it is certainly surprising that...so close an agreement should be found. This is particularly striking in the words of 11, 12 and 13 letters, the numerical composition of which is as follows:

Number of letters    11    12    13
Dickens              85    57    29
Thackeray            [...]

It is interesting to note that [...]

Unabashed by this small difference he next tried two groups of words from John Stuart Mill's Political Economy and his Essay on Liberty, in which he 'expected to find more longer words than in the novelists'. 'But I confess to considerable surprise in finding from the [...] that, although on the whole the anticipation was realized, the word which occurred most frequently was not the three-letter word, as with both Dickens and Thackeray, but the word of two letters.' The explanation, he says, 'is to be found in the liberal use of [...] in sentence-building'. The results, given in two separate diagrams, are here combined into one (Fig. 2).

Mendenhall next studied two addresses given by a Mr Edward Atkinson on 'Labour Questions' to two different audiences, one consisting of working men and the other students of a Theological College. There was 'a marked difference in style', but the word-length distributions (from two samples of 5000 words) were very similar. He comments that Mr Atkinson's composition was 'remarkable in the shortness of the words used'. The average word length was 4.298 letters; which is, however, only 0.044 shorter than the samples from Dickens.


For comparison with all the above studies of works in the English language, Mendenhall gave a distribution of the first 5500 words in Caesar's Commentaries, in Latin. He finds a mean word length of 6.065 letters and an entirely different form of curve with peaks at 2, 5 and 7 letters (see Fig. 3). This is of course connected with the Latin construction of adding to the main root for inflexions instead of using additional small words.
[Figs. 1-4. Frequency curves; horizontal axis: No. of letters per word; vertical axis: Numbers per 1000 words.]

Fig. 1. Samples of 10,000 words each from Dickens's Oliver Twist and Thackeray's Vanity Fair. Redrawn from Mendenhall (1887, fig. 7).

Fig. 2. Samples of 5000 words each from two works of John Stuart Mill (Political Economy and Essay on Liberty). Redrawn from Mendenhall (1887, figs. 8, 9).

Fig. 3. [...]

Fig. 4. [...] of Shakespeare (400,000 words) and works of Bacon (200,000 words). [...]


[...] reader; the most notable, of course, being the attempt to solve questions of disputed authorship, such as exist in reference [...] other less widely known examples [...] in tracing the growth of a language, and in studying the growth of the vocabulary from childhood to manhood.'

'If striking differences are found between the curve of known and suspected compositions of any writer, the evidence against identity of authorship would be quite conclusive. If the two compositions should produce curves which are practically identical, the proof of a common origin would be less convincing, for it is possible, though not probable, that two writers might show identical curves.'
It was not until 14 years later that Mendenhall returned to the problem in a paper in Popular Science Monthly, published in December 1901. In this he repeats some of the discussion and diagrams from his earlier paper dealing with Dickens, Thackeray, John Stuart Mill and Mr Atkinson; but in addition to his earlier diagrams showing analysis of a Latin author he gives examples of single authors in Italian, Spanish, French and German (see Fig. 3). The French and Spanish curves, possibly by the idiosyncrasies of the authors chosen, have their peaks in the 2-letter words; the Italian has two peaks at 2 and 5 letters; while the German has a peak at 3 letters, but with more longer words, reaching one word of 27 letters.

After this introduction, he settles down to a discussion of the value of his technique in the study of the authorship of the plays of Shakespeare. For this the lengths of nearly two million words were counted from the works of Shakespeare and of some of his contemporaries. Most unfortunately the data are all condensed into half a dozen small diagrams, and not one table of the actual numbers is given. Does his evidence still exist, hidden away somewhere?

Mendenhall says: 'The result from the start, with the first group of 1000 words, was a surprise. Two things appeared from the beginning: Shakespeare's vocabulary consisted of words whose average length was a trifle below four letters, less than any writer of English previously studied; and his word of greatest frequency was the four-letter word, a thing never [...]' (see Fig. 4).

A comparison with those of Thackeray and Dickens shows that Shakespeare had a higher proportion of words with 1, 2, 4 and 5 letters, and a lower proportion of words with 3 letters and of 6 letters upwards; which accounts for the fact that while his peak [...] his average number of letters per word is lower. In modern terminology, he would [...]

[...] words were counted including 'in whole or in part, nearly all his most famous plays' and it was found that this characteristic curve is most persistent, that based on the first [...],000 words differing very little from the whole count. In a diagram showing two examples of 200,000 words each, it is practically impossible to separate the two curves, in spite of the fact that, Mendenhall says, the differences have been of necessity slightly exaggerated in order to make them show at all!
The Comparison was next made of Shakespeare R p 
of 9 1 2e of Lucrece and Venus and Adonis. The pros 


th, letters, atd fawerawords of'5;.6 and 7 letters: but 


3 1 «letter word. Mendenhall writes: < At first this was thoug 
8 ti s : | | 
ads dial € was found un pace by Francis Bacon, including his Henry V. 1 7j 
pa lis Aq penmi o ith a total of nearly 200,000 words. The frequency dis- 
vancement of Learning, wi (see Fig. 4) with the peak at the 


ut; kespeare 
Bo li Was quite different from that of Sha s : madio leid o anare leaped 
Wit, — Word, with more 2-letter words, : 


ader is at liberty to draw any 
-13 letters. Mendenhall here com? 


is poetry as exemplified by 


prose and hi 
gave more shorter words, particularly 


both gave the characteristic peak at 
ht to beageneral characteristic 


fewer Wl 


ments that ‘the re 


252 Studies in the history of probability and. statistics. IV 


conclusion he pleases from this diagram. Should he conclude that, in view of the extraordinary difference in these lines, it is clear that Bacon could not have [...] ordinarily attributed to Shakespeare...the question still remains, who did?’

An examination of the works of Ben Jonson, in two groups of 75,000 words, showed once more a peak at the 3-letter word; but an extensive study of the plays of Beaumont and Fletcher showed that on the final average the number of 4-letter words was slightly greater than those of 3 letters, although the excess was by no means persistent in smaller groups. The final curve was not unlike that of Shakespeare, and Mendenhall suggested that the ‘[lack] of persistency of form among small groups’ might be due to the dual authorship.


Fig. 5. Comparison of frequency distribution of word length in two large samples from the plays of Shakespeare and of Christopher Marlowe. Redrawn from Mendenhall (1901, fig. 9).

Fig. 6. The number of letters per word, on a logarithmic scale, from works of Thackeray and of Shakespeare (as shown by Mendenhall) plotted against the accumulated total as a percentage of the whole sample on a probability scale. It indicates some resemblance to a log-normal distribution for words up to about 8 letters, but differing above this level.


When, however, he turned his attention to the plays of Christopher Marlowe ‘something akin to a sensation was produced among those engaged in the work’. ‘In the characteristic curve of his plays Marlowe agrees with Shakespeare about as well as Shakespeare agrees with himself’ (Fig. 5).

Finally, Mendenhall pointed out that a dramatic composition Armada Days written by Prof. Shaler of Harvard, in which the author endeavoured to compose in the spirit and the style of the ‘Elizabethan days’, gave a curve (from only about 20,000 words) with ‘excess of the 4-letter word and in other respects decidedly Shakespearian’. Mendenhall does not give this curve or any figures.

Discussion 


We are not concerned here so much with the results that Mendenhall obtained, or with their repercussions, but rather with the general value of the technique. There can be little doubt that he was the first to act on the suggestion of de Morgan, and that his method of using the frequency distribution, instead of merely the average length of words, was a distinct improvement, although the average length would not normally be given


jth 




to-day withou ; cues 

less reliable ee -ani skew form of the curve makes this latter measure 

Mendenhall's sampling method was to take blocks of 1000 words each ‘at the beginning [...] after a few thousand words had been counted, the book was opened near [...] and the count continued’. This method is not above reproach, but, in view of the number of samples and the general close resemblances, it is unlikely that a more randomized method would produce any measurably different result. In the case of the plays of Shakespeare the sampling was large enough to justify the statement that it included nearly all the most famous plays ‘in whole or in part’.
That Mendenhall appreciated the difference between the statistical method and evidence from selected phraseology believed to be characteristic is clear from the following statements: ‘The chief merit of the method consisted in the fact that its application required [...] of judgement’ and that ‘characteristics might be revealed which the author [...] no attempt to conceal, being himself unaware of their existence’; and again, ‘the conclusions reached through its use would be independent of personal bias, the work of one person in the study of an author being at once comparable with the work of any other’.

That Mendenhall saw the wide range of possibilities is clear from his statement: ‘it is [...] to say that the method is not necessarily confined to the analysis of a composition by means of word-length: it may equally be applied to the study of syllables, of words in sentences, and in various ways’. And I have already quoted his suggestion as to its value in comparative studies.
[...] words, additional comments may be of interest. The curve of the frequency distribution of words of different lengths is in every case skew, with the peak usually at 3 or 4 letters per word and the tail running off generally to 15 or 16, but sometimes to higher than this. In my contribution to the study already mentioned, I showed that by the use of a logarithmic scale the skew distributions of sentence length became approximately symmetrical, and that the distribution resembled a log-normal. It is of interest to see if Mendenhall’s figures for word length show a similar relation. We can, however, note beforehand that the length of a sentence is under the conscious control of a writer, who may stop when he pleases. The lengths of words are not so controlled and selection of words for reason of their length alone is not likely to occur.

Most unfortunately only three sets of numbers are given by Mendenhall, all in his first paper. They are for 1000 words in Oliver Twist and two sets of the same size for Vanity Fair. Taking the latter we find that the accumulated totals up to each successive number of letters per word, expressed as percentages of the whole, are as shown in Table 1.

When these results are plotted on to log-probability paper the result is as shown in Fig. 6. There is an approximately straight-line relation up to about 8 letters per word, but above that there is a definite break. The straight-line portion suggests a log-normal distribution, with a mean log at 0·53 and a standard deviation of approximately 0·26. On an arithmetic scale this is equivalent to a geometric mean of about 3·4 and a standard deviation of × or ÷ 1·8.* The arithmetic mean is 4[...] letters per word.

* When a frequency distribution is skew on an arithmetic grouping of the data but approximately symmetrical when a geometric scale is used, the standard deviation cannot be expressed on the arithmetic scale as ‘+ or −’. The use of the expression ‘3·4, × or ÷ by 1·8’ implies that approximately 33% of the observations will be between 3·4 and 3·4 × 1·8, and approximately 33% will be between 3·4 and 3·4 ÷ 1·8; and approximately 17% above, and below, these limits.



In his second paper Mendenhall gives five graphs showing Shakespeare's frequency distribution per 1000 words in comparison with other authors. With a lens and a fine [...] it is possible to read the numbers to about three units, but unfortunately the results so obtained from the five diagrams do not agree. This is possibly because (as he admits in one case) Mendenhall exaggerated the differences in the diagrams in order to separate the two curves. I have made an estimate from each of the five diagrams, and the average values are given in Table 2.


Table 1

No. of letters   No. of words   Accumulated    Accumulated total,
per word         out of 2000    total          % of 2000

      1               58             58              2·9
      2              315            373             18·7
      3              480            853             42·7
      4              351           1204             60·2
      5              244           1448             72·4
      6              154           1602             80·1
      7              152           1754             87·7
      8              100           1854             92·7
      9               63           1917             95·9
     10               43           1960             98·0
     11               16           1976             98·8
     12               15           1991             99·6
     13                4           1995             99·8
     14                5           2000            100·0
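The accumulated percentages of Table 1, and the straight-line portion read from the log-probability plot, can be checked numerically. The following is only a sketch under stated assumptions: the line is fitted by least squares to the points for 1 to 8 letters (the original fit was presumably made by eye on graph paper), and each accumulated percentage is plotted against the logarithm of the word length itself. The function names are illustrative and not taken from the paper.

```python
import math
from statistics import NormalDist

# Numbers of words of 1, 2, ..., 14 letters in 2000 words (Table 1).
COUNTS = [58, 315, 480, 351, 244, 154, 152, 100, 63, 43, 16, 15, 4, 5]


def accumulated_percentages(counts):
    """Running totals expressed as percentages of the whole sample."""
    total = sum(counts)
    running = 0
    percentages = []
    for count in counts:
        running += count
        percentages.append(100.0 * running / total)
    return percentages


def fit_log_normal(counts, max_letters=8):
    """Least-squares line of log10(word length) on the normal deviate.

    Assumption: only word lengths up to max_letters are used, following the
    remark that the relation is roughly straight up to about 8 letters.
    Returns (mean of log10 length, standard deviation of log10 length).
    """
    percentages = accumulated_percentages(counts)[:max_letters]
    standard_normal = NormalDist()
    z = [standard_normal.inv_cdf(p / 100.0) for p in percentages]
    y = [math.log10(n) for n in range(1, max_letters + 1)]
    z_bar = sum(z) / len(z)
    y_bar = sum(y) / len(y)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / sum(
        (zi - z_bar) ** 2 for zi in z
    )
    intercept = y_bar - slope * z_bar
    return intercept, slope


if __name__ == "__main__":
    mean_log, sd_log = fit_log_normal(COUNTS)
    print("mean log:", round(mean_log, 2), "s.d. of log:", round(sd_log, 2))
    print("geometric mean:", round(10 ** mean_log, 1))
```

The intercept is the log-length at the 50% point and the slope is the standard deviation on the log scale, so the two returned values correspond to the "mean log" and "standard deviation" quoted in the text.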

Table 2

Letters   Words     Letters   Words     Letters   Words

   1       47·6        6       71·2       11       3·4
   2      175·8        7       52·6       12       2·0
   3      225·0        8       31·6       13       1·0
   4      237·6        9       18·4       14       0·4
   5      124·4       10        9·0       —        —


When the accumulated totals are plotted on log-probability paper as in the previous case the result (Fig. 6) indicates a fairly regular departure from the straight line, although again the break is more distinct above 7 letters per word.

I have also attempted to get, from Mendenhall’s diagram of five samples of 1000 words each from Oliver Twist, some measure of the error of his results.

The frequency distribution of words of certain lengths in the five samples is approximately as given in Table 3.


If the size of the sample were inc 
the pattern of the material sampl 




[...] the comparison of two samples of this size (assuming the same order of variation [...]) the [standard error] of the difference would be approximately 1·4 times the above or 2·9, 1·7, [...]. Thus differences in number of words per 1000 would have to be of the order of [...] and 2·1 to be significant at the 1 in 20 level, and 7·5, 4·4, 5·4 and 2·8 to be significant at the 1 in 100 level. The five samples on which the above rough estimate is made [were] consecutive samples of 1000 words from one work; when different works, [written at] different periods by the same author, are combined the error of the mean would certainly be greater.
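The arithmetic behind Table 3 and the significance levels quoted above can be sketched as follows. Several things here are assumptions rather than statements from the paper: the four rows are taken to refer to words of 3, 4, 5 and 6 letters, the first entries of the 4- and 5-letter rows are read as 170 and 95 (the readings that reproduce the printed means and standard errors), and the multipliers 1·96 and 2·58 are the usual Normal deviates for the 1 in 20 and 1 in 100 levels, which agree with the figures quoted in the text.

```python
import math
from statistics import mean, stdev

# Words per 1000 in five samples of 1000 words (Table 3).
# Keys (word lengths) and two of the entries are readings, see the note above.
SAMPLES = {
    3: [221, 232, 236, 254, 268],
    4: [170, 175, 183, 186, 198],
    5: [95, 102, 120, 122, 123],
    6: [83, 92, 94, 97, 103],
}


def mean_and_standard_error(values):
    """Sample mean and standard error of that mean."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def standard_error_of_difference(se_mean):
    """S.E. of the difference of two means with equal s.e. (factor about 1.4)."""
    return math.sqrt(2.0) * se_mean


def critical_difference(se_difference, normal_deviate):
    """Difference needed for significance at the level set by normal_deviate."""
    return normal_deviate * se_difference


if __name__ == "__main__":
    for letters, values in SAMPLES.items():
        m, se = mean_and_standard_error(values)
        print(letters, round(m, 1), round(se, 2))
```

With a standard error of difference of 2·9, for example, `critical_difference(2.9, 2.58)` gives about 7·5, the first of the 1 in 100 values quoted.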

Table 3

Letters   Five samples               Mean     S.E. of mean
                                              for 5000 words

   3      221, 232, 236, 254, 268    242·2        8·36
   4      170, 175, 183, 186, 198    182·4        4·82
   5       95, 102, 120, 122, 123    112·4        5·80
   6       83,  92,  94,  97, 103     93·8        3·28

An examination of Mendenhall's diagram giving the comparison of the distributions for Shakespeare and Marlowe suggests measurable differences only in words up to 5 letters, Marlowe differing from Shakespeare approximately as follows: 1 letter, 5 less; 2 letters, 3 less; 3 letters, 3 more; 4 letters, no difference; and 5 letters, 5 more. All the other word lengths are indistinguishable in the diagram. These differences may have been
's are in words per 1000 in large samples—in the 


but in the case of Marlowe the size of samples is 
of Bacon with Shakespeare (see Fig. 5) 


t 
Would seem likely that real differences betw 
ection for several consecutive word lengths. One 


scious preference for longer 


Wi 
ords iti ikely that one author would prefer words of, say, 11 
e 12 to the 11. Thus, rapid 


would be less convincing than blocks of depar- 
d would be more likely to be due to 


bin. in preference to 12, while another au 
Bes of departure directions in sequence 


"re, 
8 Of sim; ; : i š i 
error, Similar sign as evidence of real differences, an 


Mendenhall, in his 1887 paper, calls attention to the fact that in an analysis of Dickens’s Christmas Carol, words of 7 letters appeared to be unduly numerous, due to the fact that the character ‘Scrooge’, frequently referred to, is a word of this length. It would be desirable to leave names of persons and places out of any tabulation.
BIOGRAPHICAL NOTE

Thomas Corwin Mendenhall was born in Ohio on 4 October 1841 and died there on 22 March 1924. He was the descendant of a Benjamin Mendenhall who emigrated from England (probably from Wiltshire) in 1686 to join Penn’s Colony and who settled at Concord, Pennsylvania. T. C. Mendenhall spent some early years as a school teacher, but in 1873 became the first Professor of Physics and Mechanics at the newly founded Ohio Agricultural and Mechanical College. He was Professor of Physics in the University of Tokyo from 1878 to 1881 and in the Ohio State University from 1881 to 1886. He then became President of the Rose Polytechnic Institute in Indiana and was elected the following year to the National Academy of Sciences. After a few years as Superintendent of the U.S. Coast and Geodetic Survey he became President of the Worcester Polytechnic Institute, where he remained until his retirement at the age of 60.

His biography by Henry Crew (Biograph. Mem. Nat. Acad. Sci., Wash., 16, 331-81), to which I am indebted for the above information, lists about sixty publications in physics, particularly geophysics, units of electrical measurement, state boundary lines in the U.S.A. and many other related subjects. The first of the two papers at present under discussion is listed, but not the second. There is no mention of his interest in the statistics of literary style, but it is said that he left in MSS. about 900 pages of an autobiography which has never been published. If still available, it might repay study.


REFERENCES

MENDENHALL, T. C. (1887). The characteristic curves of composition. Science, 9 (214, supplement), 237-49.

MENDENHALL, T. C. (1901). A mechanical solution of a literary problem. Pop. Sci. Mon. 60, 97-105.

DE MORGAN, A. (1872). A Budget of Paradoxes. London (2nd edition 1915).

WILLIAMS, C. B. (1940). A note on the statistical analysis of sentence length as a criterion of literary style. Biometrika, 31, 356-61.

WILLIAMS, C. B. (1952). Statistics as an aid to literary studies. Penguin Science News, no. 24, pp. 99-106.

YULE, G. U. (1939). On sentence-length as a statistical characteristic of style in prose. Biometrika, 30, 363-90.




A GOODNESS OF FIT TEST FOR SPECTRAL DISTRIBUTION 
FUNCTIONS OF STATIONARY TIME SERIES WITH 
NORMAL RESIDUALS 


By A. M. WALKER 
Design and Analysis of Scientific Experiment, 6 Keble Road, Oxford


1. INTRODUCTION

In recent years problems of statistical inference associated with the spectral analysis of stationary time series have been studied by a number of authors (see, for example, Bartlett, 1950, 1954; Grenander, 1951; Grenander & Rosenblatt, 1953; Whittle, 1951, 1954). In most of these problems the aim has been to obtain information about the spectral density function of the series, which is usually assumed to be absolutely continuous. Grenander & Rosenblatt (1953) give asymptotic confidence bands for the spectral distribution function, but still require the assumption of an absolutely continuous spectral density function.

The present paper adopts an approach which is associated directly with the spectral distribution function, and yields a general large sample goodness of fit test of the hypothesis that a stationary Normal series has a specified spectral distribution function which is not necessarily absolutely continuous but may have certain points of discontinuity.


2. THE BASIC TEST STATISTICS AND THEIR MAIN PROPERTIES

Let $\{X(t)\}$ $(t = 0, \pm 1, \pm 2, \ldots)$ be a time series which is stationary to the second order. We assume without loss of generality that $X(t)$ is measured from its mean, i.e. that
\[ E[X(t)] = 0. \]
Then it is well known that we can write
\[ X(t) = \int_{-\pi}^{\pi} e^{it\omega}\, dZ(\omega) \tag{1} \]
(the integral being interpreted in the mean square sense), where $Z(\omega)$ is a complex-valued orthogonal process defined over $(-\pi, \pi)$ with
\[ E|Z(\omega_2) - Z(\omega_1)|^2 = F(\omega_2) - F(\omega_1), \qquad E[\{Z(\omega_4) - Z(\omega_3)\}\{Z(\omega_2) - Z(\omega_1)\}^*] = 0 \tag{2} \]
when the intervals $(\omega_1, \omega_2)$, $(\omega_3, \omega_4)$ do not overlap. $F(\omega)$ is the spectral distribution function (not necessarily normalized to unity). $F(\omega)$ is non-negative and non-decreasing over $(-\pi, \pi)$ and is related to the autocovariance function $\sigma^2\rho(\tau) = E[X(t)X(t+\tau)]$ by the equation
\[ \sigma^2\rho(\tau) = \int_{-\pi}^{\pi} e^{i\tau\omega}\, dF(\omega). \tag{3} \]
The symbol * denotes complex conjugate.


The inversion formula corresponding to (1) is, with $Z(0) = 0$,
\[ Z(\omega) = \frac{1}{2\pi} \sum_{r=-\infty}^{\infty} \frac{1 - e^{-ir\omega}}{ir}\, X(r), \tag{4} \]
for every continuity point $\omega$ of $F$ and therefore of $Z$. Here the infinite sum is to be interpreted as a mean-square limit $\lim_{n\to\infty} \sum_{r=-n}^{n}$ and $(1 - e^{-ir\omega})/(ir)$ is to be replaced by $\omega$ when $r = 0$. This suffices to determine $Z(\omega)$ uniquely for all $\omega$ in $(-\pi, \pi)$, since at points of discontinuity it can always be defined to be continuous on the right (corresponding to $F(\omega)$ being continuous on the right).


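Let $A(\omega)$ and $B(\omega)$ be the real and imaginary parts of $Z(\omega)$. Then we have
\[ E[A(\omega_2) - A(\omega_1)]^2 = E[B(\omega_2) - B(\omega_1)]^2 = \tfrac12[F(\omega_2) - F(\omega_1)], \tag{5} \]
\[ E[\{A(\omega_4) - A(\omega_3)\}\{A(\omega_2) - A(\omega_1)\}] = E[\{B(\omega_4) - B(\omega_3)\}\{B(\omega_2) - B(\omega_1)\}] = 0 \tag{6} \]
when $(\omega_1, \omega_2)$, $(\omega_3, \omega_4)$ do not overlap, and
\[ E[\{A(\omega_4) - A(\omega_3)\}\{B(\omega_2) - B(\omega_1)\}] = 0 \tag{7} \]
for any intervals $(\omega_1, \omega_2)$, $(\omega_3, \omega_4)$.

These results do not follow directly from (2), but can easily be established using (4) (cf. also Doob, 1952, p. 482).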


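Suppose now that we are given a set of observations at $2n + 1$ consecutive times, say $X(-n), X(-n+1), \ldots, X(0), X(1), \ldots, X(n)$. Then we can approximate to $A(\omega)$ and $B(\omega)$ by the finite sums
\[ A_{2n}(\omega) = \frac{1}{2\pi} \sum_{r=-n}^{n} \frac{\sin r\omega}{r}\, X(r), \tag{8} \]
and
\[ B_{2n}(\omega) = \frac{1}{2\pi} \sum_{r=-n}^{n} \frac{\cos r\omega - 1}{r}\, X(r), \tag{9} \]
the coefficients for $r = 0$ being $\omega$ in (8) and zero in (9), in accordance with (4).

If $n$ is fairly large, we should expect that (5), (6) and (7) will remain approximately true when $A_{2n}(\omega)$ and $B_{2n}(\omega)$ are substituted for $A(\omega)$ and $B(\omega)$, provided that $\omega$ is a continuity point of $F$. This suggests that quantities of the form $A_{2n}(\omega_2) - A_{2n}(\omega_1)$ or $B_{2n}(\omega_2) - B_{2n}(\omega_1)$ can be used to construct tests of the goodness of fit of the observations to the spectral distribution function $F(\omega)$.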
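As an illustration of the finite sums (8) and (9), the sketch below evaluates $A_{2n}(\omega)$ and $B_{2n}(\omega)$ directly from a list of $2n + 1$ observations. It assumes the normalization $1/(2\pi)$ implied by the inversion formula (4), with the $r = 0$ coefficient equal to $\omega$ for the real part and zero for the imaginary part; the function names are illustrative only. The second function checks property (5) for the simplest case of an uncorrelated series, for which $F(\omega) - F(0) = \sigma^2\omega/(2\pi)$ and hence the variance of $A_{2n}(\omega)$ should approach $\sigma^2\omega/(4\pi)$.

```python
import math


def finite_sum_statistics(x, omega):
    """Return (A_2n(omega), B_2n(omega)) for observations X(-n), ..., X(n).

    x is a list of 2n + 1 values, already measured from the mean,
    with x[n] holding X(0).
    """
    n = (len(x) - 1) // 2
    a = omega * x[n]  # r = 0 term of the real part; the imaginary part has none
    b = 0.0
    for r in range(1, n + 1):
        sine_coefficient = math.sin(r * omega) / r            # even in r
        cosine_coefficient = (math.cos(r * omega) - 1.0) / r  # odd in r
        a += sine_coefficient * (x[n + r] + x[n - r])
        b += cosine_coefficient * (x[n + r] - x[n - r])
    return a / (2.0 * math.pi), b / (2.0 * math.pi)


def white_noise_variance_of_a(n, omega, sigma2=1.0):
    """Exact variance of A_2n(omega) when the X(r) are uncorrelated."""
    total = omega ** 2
    for r in range(1, n + 1):
        total += 2.0 * (math.sin(r * omega) / r) ** 2
    return sigma2 * total / (4.0 * math.pi ** 2)


if __name__ == "__main__":
    print(white_noise_variance_of_a(2000, 1.0), 1.0 / (4.0 * math.pi))
```

The pairing of $X(r)$ with $X(-r)$ in the loop also makes visible why $A_{2n}$ is an odd and $B_{2n}$ an even function of $\omega$.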

We therefore take such quantities as our basic test statistics. Since $B_{2n}$ is an even function of $\omega$ and $A_{2n}$ an odd function we need consider only non-negative values of $\omega$; in view of this it is convenient to express our results in terms of the spectral distribution function $F_+(\omega)$ defined over $(0, \pi)$, which is such that
\[ d_{\omega}F_+(\omega) = 2d_{-\omega}F(-\omega) = 2d_{\omega}F(\omega) \tag{10} \]
$(0 < \omega < \pi)$ and
\[ \sigma^2\rho(\tau) = \int_0^{\pi} \cos\tau\omega\, dF_+(\omega). \]


The main pro 
as follows: 

(i) Let (w, w) be a continuity 
continuity points of F^ (o), 


B S re 
perties of these statistics, corresponding to equations (5), (6) and Oe 


: , re 
interval for F, (o), i.e. an interval whose end-points a 
with o, >7/n and 7—w,>7/n, and let n be so large that 


F lortAnfn)- F, (0,— ànn) 6 —1,2), where à= 0(1), 


so dj 


is small compared with $F_+(\omega_2) - F_+(\omega_1)$. Then to a good approximation
\[ E[A_{2n}(\omega_2) - A_{2n}(\omega_1)]^2 = E[B_{2n}(\omega_2) - B_{2n}(\omega_1)]^2 = \tfrac14[F_+(\omega_2) - F_+(\omega_1)]. \tag{11} \]


Als H 
Rw is adn and F, (os Arn) — Foo — Ann), F,(Am/n) are small compared with 
E ^ "i puto,-0in ü 1) provided that F, has no discontinuity at o=0. A similar 
(i) ^ s with w,=7, provided that F, has no discontinuity at o=7. 
Peis aes w) and (wz, w4) be continuity intervals for F, (v), and either 3 — w> 7 [n or 
E [fos (n/n) and F(4(W_+ 3) +Am/n) — F(3(W2+3)—Am/n), with A= O(1), be small 
pared with F(w,)—F(w), (v1) — F(w). Then to a good approximation 


E[(As; (v4) — Aou (v3)) {Aen(@2) — As, (01))] = M 3 
EBan (01) — Bas (o3) {Bon (02) — Bos (091 = 0- (8) 


(iii) For any intervals $(\omega_1, \omega_2)$, $(\omega_3, \omega_4)$
\[ E[\{A_{2n}(\omega_4) - A_{2n}(\omega_3)\}\{B_{2n}(\omega_2) - B_{2n}(\omega_1)\}] = 0. \tag{13} \]
This last result, which is exact, follows at once from the fact that in
\[ \sum_{r,s=-n}^{n} \frac{\cos r\omega_2 - \cos r\omega_1}{r}\,\frac{\sin s\omega_4 - \sin s\omega_3}{s}\,\rho(r - s) \]
terms with $(r, s) = (r_0, s_0)$ and $(r, s) = (-r_0, -s_0)$ cancel, there being no contribution from terms with $r = 0$. The derivation of (i) and (ii) is more complicated, and certain difficulties arise in assessing the accuracy of the approximations (it will be noted that the conditions under which this may be expected to be high have not been stated at all precisely). The discussion of these at this point would mean a fairly lengthy digression from the main argument of the paper, and is therefore postponed until the concluding section (§ 6).

Now let $\{X(t)\}$ be a Normal series. Then $A_{2n}(\omega)$, $B_{2n}(\omega)$ are Normal random variables, and from (11) and (13),
\[ [A_{2n}(\omega_2) - A_{2n}(\omega_1)]^2 + [B_{2n}(\omega_2) - B_{2n}(\omega_1)]^2 \]
is, under the conditions stated in (i) above, distributed approximately as
\[ \tfrac14[F_+(\omega_2) - F_+(\omega_1)]\,\chi^2_2, \]
$\chi^2_m$ denoting a random variable distributed in the standard $\chi^2$ form with $m$ degrees of freedom. Also, using (12), it follows that if $0 = \omega_0 < \omega_1 < \omega_2 \ldots < \omega_k = \pi$† is a subdivision of the interval $(0, \pi)$ such that $\omega_0, \omega_1, \ldots, \omega_k$ are continuity points of $F_+(\omega)$, and the increase in $F_+$ in each interval $(\omega_i, \omega_{i+1})$ is large compared with that in $(\omega_i - \lambda\pi/n, \omega_i + \lambda\pi/n)$, $\lambda = O(1)$ $(i = 0, 1, \ldots, k - 1)$, then
\[ [A_{2n}(\omega_{i+1}) - A_{2n}(\omega_i)]^2 + [B_{2n}(\omega_{i+1}) - B_{2n}(\omega_i)]^2 \quad (i = 0, \ldots, k - 1) \tag{14} \]
are approximately distributed as $\tfrac14[F_+(\omega_{i+1}) - F_+(\omega_i)]\,\chi^2_2$ independently of one another.

It should be noted that

(i) if $E[X(t)]$ is not specified a priori we may replace $X(r)$ in (8) and (9) by $X(r) - \bar{X}$, where $\bar{X}$ is the mean of the set of observations. This introduces an additional term in (8) which is of order $n^{-1/2}$, and hence for sufficiently large $n$ does not affect the approximations (11) [...];

† When 0 or $\pi$ are not continuity points of $F_+(\omega)$ we take $\omega_0 > 0$ or $\omega_k < \pi$.



(ii) the assumption that the number of observations is odd is not important since with an even number, $2n + 2$ say, we can, in defining $A_{2n}(\omega)$ and $B_{2n}(\omega)$, either omit one of the end observations, the loss of information being asymptotically negligible, or make the limits of summation $-n$, $n + 1$, which gives an additional term of order $n^{-1}$.

It does not seem possible to make any general statement about the distributions of $A_{2n}(\omega)$ and $B_{2n}(\omega)$, and hence those of the statistics (14), without the assumption that $\{X(t)\}$ is Normal. We might, for example, wish to consider the case of the general discrete linear process, defined by $X(t) = \sum_{u=-\infty}^{\infty} g(t - u)\,\epsilon(u)$, the $\epsilon(u)$ being mutually independent with a common distribution, but the limiting distributions of $A_{2n}(\omega)$, $B_{2n}(\omega)$ as $n \to \infty$ will clearly depend on the form of this distribution, e.g. for the independent process $\{X(t)\} = \{\epsilon(t)\}$, the $m$th cumulant of
\[ A_{2n}(\omega) \to (2\pi)^{-m}\Bigl\{\omega^m + 2\sum_{r=1}^{\infty}\Bigl(\frac{\sin r\omega}{r}\Bigr)^m\Bigr\}\,\kappa_m(\epsilon). \]


3. THE GOODNESS OF FIT CRITERIA 


Let the null hypothesis be that $\{X(t)\}$ has a specified spectral distribution function $F_+(\omega)$. Then we can construct statistics
\[ T_i = \frac{\{A_{2n}(\omega_{i+1}) - A_{2n}(\omega_i)\}^2 + \{B_{2n}(\omega_{i+1}) - B_{2n}(\omega_i)\}^2}{\tfrac14\{F_+(\omega_{i+1}) - F_+(\omega_i)\}} \quad (i = 0, 1, \ldots, k - 1), \tag{15} \]
which for sufficiently large $n$ will, to a good approximation, be distributed independently as multiples of $\chi^2_2$, these multiples being unity on the null hypothesis, and either greater or less than unity on the alternative hypotheses. Equivalently $s_i^2 = \tfrac12 T_i$ will be approximately independent variance estimates each with $f_i = 2$ degrees of freedom, the corresponding variances $\sigma_i^2$ being equal to unity on the null hypothesis and having arbitrary values on the alternative hypotheses. From this property we can easily derive suitable criteria for testing goodness of fit.

The most obvious one to use is the standard likelihood ratio criterion
\[ M = -2[L - L_{\max}] = \sum_{i=0}^{k-1} f_i(s_i^2 - 1 - \log s_i^2) = \sum_{i=0}^{k-1} (T_i - 2\log T_i) + 2k(\log 2 - 1). \tag{16} \]
On the null hypothesis we have $M \sim (\tfrac76)\chi^2_k$, using the approximation due to Bartlett (1937) (which should be sufficiently accurate except perhaps for small $k$, when it may be advisable to use the better approximation, due to Box (1949), based on the standard $F$ distribution). For large $k$ it is worth while modifying the $\chi^2$ approximation to $M \sim 1.118\,\chi^2_{1.0321k}$, which, if we neglect the effect of the approximations of § 2, reproduces the mean and variance of $M$ exactly, $\kappa_3(M)$ with an error of about 3%, and $\kappa_4(M)$ with an error of about 8% (this is easily verified from the expression $k\{\log\Gamma(1 - 2t) - 2t - (1 - 2t)\log(1 - 2t)\}$ for the cumulant generating function of $M$, $\log E(\exp Mt)$).
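The two forms of the criterion in (16), and the claim that the modified approximation reproduces the mean and variance of $M$, can be checked numerically. The sketch below is an illustration only: the cumulants are obtained by central differences of the cumulant generating function quoted in the text (a numerical shortcut chosen here, not the author's method), and the function names are hypothetical.

```python
import math


def m_statistic(t_values):
    """Likelihood ratio criterion in the second form of (16)."""
    k = len(t_values)
    return sum(t - 2.0 * math.log(t) for t in t_values) + 2.0 * k * (
        math.log(2.0) - 1.0
    )


def m_from_variance_estimates(t_values):
    """The same criterion written as sum of f (s^2 - 1 - log s^2), f = 2."""
    total = 0.0
    for t in t_values:
        s2 = 0.5 * t
        total += 2.0 * (s2 - 1.0 - math.log(s2))
    return total


def cgf_per_interval(t):
    """Cumulant generating function of M for a single interval (k = 1)."""
    return math.lgamma(1.0 - 2.0 * t) - 2.0 * t - (1.0 - 2.0 * t) * math.log(
        1.0 - 2.0 * t
    )


def first_two_cumulants(step=1e-3):
    """Mean and variance per interval by central differences at t = 0."""
    plus, minus = cgf_per_interval(step), cgf_per_interval(-step)
    mean = (plus - minus) / (2.0 * step)
    variance = (plus - 2.0 * cgf_per_interval(0.0) + minus) / step ** 2
    return mean, variance


if __name__ == "__main__":
    mean, variance = first_two_cumulants()
    scale, df_factor = 1.118, 1.0321
    print(mean, scale * df_factor)                    # mean per interval
    print(variance, 2.0 * scale ** 2 * df_factor)     # variance per interval
```

A scaled variate $c\chi^2_{\nu k}$ has mean $c\nu k$ and variance $2c^2\nu k$, which is why the comparison is made against `scale * df_factor` and `2 * scale**2 * df_factor` per interval.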

An alternative criterion is obtained by assessing the significance of each $T_i$ separately, using a two-tailed test, and then combining the results by Fisher’s method (see, for example, Fisher, 1941, p. 95). Thus if
\[ p_i = \exp(-\tfrac12 T_i), \quad \text{we let} \quad y_i = 1 - p_i \ (p_i > \tfrac12), \qquad y_i = p_i \ (p_i \le \tfrac12), \]
and refer $U = \sum_{i=0}^{k-1} u_i$, where $u_i = -2\log 2y_i$, to a $\chi^2$ distribution with $2k$ degrees of freedom, high values being considered significant. With this procedure there is the advantage that from the individual values of the $p_i$, the intervals $(\omega_i, \omega_{i+1})$ for which there are highly significant departures of the numerators of (15) from their expectations on the null hypothesis are immediately identifiable. However, the form of the power function of the test may be unsatisfactory. This is indicated by its behaviour when $k$ is large (but still of course $\ll n$). For it can be shown, using the Normal approximation to the distribution of $U$, that for alternative hypotheses with $\epsilon_i = (1/\sigma_i^2) - 1$ of order $k^{-1/2}$, the power function is then approximately equal to
\[ 1 - \Phi(\xi_\alpha + 0.42\,\theta), \tag{17} \]
where $\Phi(x) = \int_{-\infty}^{x} e^{-y^2/2}\, dy/\sqrt{2\pi}$; $\xi_\alpha$ is such that $\Phi(\xi_\alpha) = 1 - \alpha$, $\alpha$ being the significance level of the test, and $\theta = k^{-1/2}\sum_{i=0}^{k-1} \epsilon_i$ (see Appendix, § 1). Clearly the test is ineffective for such hypotheses with $\theta \ge 0$, since then (17) $\le \alpha$.
This difficulty is overcome by modifying the $U$ test as follows. Let $U_1 = \Sigma_1 u_i$, $U_2 = \Sigma_2 u_i$, where $\Sigma_1$ denotes summation over the $k_1$ values of $i$ for which $p_i > \tfrac12$, and $\Sigma_2$ summation over the remaining $k_2 = k - k_1$ values for which $p_i < \tfrac12$, and refer $U_1$, $U_2$ to $\chi^2$ distributions with $2k_1$ and $2k_2$ degrees of freedom respectively, rejecting the null hypothesis if either is significantly large. This modified criterion is, for large $k$, sensitive to deviations $\epsilon_i$ of order $k^{-1/2}$ giving positive or negative values of $\theta$ (Appendix, § 2). The comparison of the power functions of the modified $U$ test and the $M$ (likelihood ratio) test is in general a difficult problem, but the fact that, for large $k$, $M$ is sensitive to deviations $\epsilon_i$ of order $k^{-1/4}$ (Appendix, § 3), supports the view that the modified $U$ test is, on the whole, the more powerful.

There are of course other criteria which are more suitable for particular types of alternative hypothesis. Among these may be mentioned $\max T_i$, the largest of the $T_i$, which would be appropriate for alternative hypotheses under which a large jump in the spectral distribution function (or a large peak in the spectral density function, its derivative) occurred at some point. The distribution of $\max T_i$ under the null hypothesis is obviously given by
\[ \Pr\{\max T_i > a\} = 1 - (1 - e^{-a/2})^k. \]
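The $U$ test, its modification and the test based on the largest $T_i$ can be sketched as below. The sketch assumes the reading $p_i = \exp(-\tfrac12 T_i)$, $y_i = \min(p_i, 1 - p_i)$, $u_i = -2\log 2y_i$ of the definitions above, requires every $T_i$ to be strictly positive, and uses the closed form of the $\chi^2$ upper tail for an even number of degrees of freedom; all function names are illustrative.

```python
import math


def chi2_upper_tail_even(x, df):
    """Pr(chi-square with an even number df of degrees of freedom > x)."""
    if df == 0:
        return 1.0 if x <= 0.0 else 0.0
    half = 0.5 * x
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def u_components(t_values):
    """Split the u_i into those with p_i > 1/2 and those with p_i <= 1/2."""
    small_t, large_t = [], []
    for t in t_values:
        if math.exp(-0.5 * t) > 0.5:
            y = -math.expm1(-0.5 * t)          # y = 1 - p, T unusually small
            small_t.append(-2.0 * math.log(2.0 * y))
        else:
            large_t.append(t - 2.0 * math.log(2.0))  # -2 log(2p) with y = p
    return small_t, large_t


def modified_u_test(t_values):
    """Return U, U1, U2 with their upper-tail probabilities."""
    u1_terms, u2_terms = u_components(t_values)
    u1, u2 = sum(u1_terms), sum(u2_terms)
    return {
        "U": (u1 + u2, chi2_upper_tail_even(u1 + u2, 2 * len(t_values))),
        "U1": (u1, chi2_upper_tail_even(u1, 2 * len(u1_terms))),
        "U2": (u2, chi2_upper_tail_even(u2, 2 * len(u2_terms))),
    }


def max_t_upper_tail(a, k):
    """Pr(max T_i > a) for k independent chi-square(2) variates."""
    return 1.0 - (1.0 - math.exp(-0.5 * a)) ** k


if __name__ == "__main__":
    print(modified_u_test([0.1, 1.4, 2.2, 9.5]))
    print(max_t_upper_tail(9.5, 4))
```

Keeping the two groups of $u_i$ separate is what allows the modified test to respond to an excess of unusually small $T_i$ as well as to an excess of unusually large ones.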

These tests can also be applied when, as is usually the case, $F_+(\omega)$ is not completely specified under the null hypothesis, but contains a number of parameters which have to be estimated from the data, provided that consistent estimators of these parameters are available. For then the effect of substitution of the estimators for the parameters occurring in [...] is negligible when $n$ is sufficiently large. For example, the variance of the series, $\sigma^2$, will seldom be given a priori, at most the normalized spectral distribution function $F_+(\omega)/F_+(\pi)$ being specified, and we can usually substitute the estimator
\[ \sum_{r=-n}^{n} \{X(r)\}^2/2n, \]
which is certainly consistent, having a mean-square error of order $n^{-1}$, when $\sum_\tau |\rho(\tau)|$ converges. Again, if on the null hypothesis $\{X(t)\}$ is a linear autoregressive process, defined by the stationary solution of
\[ X(t) + a_1X(t - 1) + \ldots + a_pX(t - p) = Y(t), \]




where the $Y(t)$ are independent Normal variables with zero mean and constant variance, and the roots of $z^p + a_1z^{p-1} + \ldots + a_p = 0$ have moduli less than unity, we can substitute the usual least squares estimates of the parameters $a_1, \ldots, a_p$.

With the likelihood ratio test we can, in fact, avoid the estimation of $\sigma^2$ by taking the null hypothesis to be the equality of the $\sigma_i^2$, their common value being unspecified. $M$ then becomes Bartlett’s criterion for testing the homogeneity of a set of variance estimates. Again the test of the significance of the largest $T_i$ can be made independent of $\sigma^2$ by taking the criterion to be $g = \max T_i \big/ \sum_{i=0}^{k-1} T_i$, whose distribution on the null hypothesis, given by
\[ \Pr(g > a) = \sum_{r=1}^{[1/a]} (-1)^{r-1}\binom{k}{r}(1 - ra)^{k-1}, \]
$[x]$ denoting the integral part of $x$, was first obtained by Fisher (1929).
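The upper tail of the $g$ criterion is a short alternating sum and is easily evaluated. The sketch below follows the formula as it is usually stated for Fisher's test, with the summation stopping at the integral part of $1/a$; the function name is illustrative.

```python
import math


def fisher_g_upper_tail(a, k):
    """Pr(g > a) on the null hypothesis, g = max T_i / sum of k values T_i."""
    if a <= 0.0:
        return 1.0
    if a >= 1.0:
        return 0.0
    last_term = min(k, int(1.0 / a))  # integral part of 1/a, at most k terms
    total = 0.0
    for r in range(1, last_term + 1):
        total += (-1) ** (r - 1) * math.comb(k, r) * (1.0 - r * a) ** (k - 1)
    return total


if __name__ == "__main__":
    for a in (0.2, 0.3, 0.5):
        print(a, fisher_g_upper_tail(a, 10))
```

For $k = 2$ the largest of two values is always at least half their sum, and the formula reduces to $2(1 - a)$ for $a > \tfrac12$, which gives a convenient check.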

It is worth noting that if the spectral density function $f_+(\omega)$ exists and is continuous, then when the difference between its upper and lower bounds in $(\omega_i, \omega_{i+1})$ is sufficiently small, we may, in calculating $T_i$, replace $F_+(\omega_{i+1}) - F_+(\omega_i)$ by $(\omega_{i+1} - \omega_i)\,\tfrac12\{f_+(\omega_{i+1}) + f_+(\omega_i)\}$. The advantage of doing so lies in the fact that the process is then usually of the generalized autoregressive form for which
\[ X(t) + a_1X(t - 1) + \ldots + a_pX(t - p) = Y(t) + b_1Y(t - 1) + \ldots + b_qY(t - q), \]
giving immediately
\[ f_+(\omega) = (\sigma_Y^2/\pi)\,|1 + b_1e^{i\omega} + \ldots + b_qe^{iq\omega}|^2 \big/ |1 + a_1e^{i\omega} + \ldots + a_pe^{ip\omega}|^2; \]
although, since $f_+(\omega)$ is a rational function of $\cos\omega$, $F_+(\omega)$ can be obtained explicitly in terms of elementary functions, it may be quite a complicated expression.
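The replacement of the increments of $F_+$ by trapezoidal areas under $f_+$ can be illustrated for the generalized autoregressive form. In this sketch the scale factor is taken to be the variance of $Y(t)$ divided by $\pi$ (an assumption about the garbled constant in the density formula), and the coefficient lists `a` and `b` and the function names are hypothetical.

```python
import cmath
import math


def generalized_ar_density(omega, a, b, sigma2_y=1.0):
    """f_+(omega) on (0, pi) for X + a1 X(t-1) + ... = Y + b1 Y(t-1) + ..."""
    numerator = 1.0 + sum(
        bj * cmath.exp(1j * (j + 1) * omega) for j, bj in enumerate(b)
    )
    denominator = 1.0 + sum(
        aj * cmath.exp(1j * (j + 1) * omega) for j, aj in enumerate(a)
    )
    return (sigma2_y / math.pi) * abs(numerator) ** 2 / abs(denominator) ** 2


def trapezoidal_increments(density, grid):
    """Approximate F_+(w[i+1]) - F_+(w[i]) by (w[i+1] - w[i]) * mean of ends."""
    return [
        (hi - lo) * 0.5 * (density(lo) + density(hi))
        for lo, hi in zip(grid[:-1], grid[1:])
    ]


if __name__ == "__main__":
    grid = [math.pi * i / 200 for i in range(201)]
    increments = trapezoidal_increments(
        lambda w: generalized_ar_density(w, a=[-0.5], b=[]), grid
    )
    # Total should be close to the variance of the series, 1 / (1 - 0.25).
    print(sum(increments))
```

The increments returned by `trapezoidal_increments` are the quantities that would stand in the denominators of the $T_i$ when this approximation is adopted.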


4. COMPARISON WITH TESTS BASED ON PERIODOGRAM INTENSITIES

The periodogram intensity for angular frequency $\omega$ is given by
\[ I_{2n+1}(\omega) = |H_{2n+1}(\omega)|^2, \tag{18} \]
where
\[ H_{2n+1}(\omega) = \Bigl(\frac{2}{2n+1}\Bigr)^{1/2} \sum_{r=-n}^{n} e^{ir\omega}X(r). \]
It has been shown that when the spectral density $f_+(\omega)$ exists and is continuous near $\omega_i$ $(i = 0, 1, \ldots, k - 1)$, the statistics $I_{2n+1}(\omega_i)/\{\pi f_+(\omega_i)\}$ have the same distributional properties as the $T_i$, i.e. for large $n$ are, to a good approximation, distributed as $\chi^2_2$, provided that $\omega_{i+1} - \omega_i \gg \pi/n$ and some suitable condition is imposed on the autocorrelations $\rho(\tau)$; the convergence of $\sum_\tau |\rho(\tau)|$ is certainly sufficient (see, for example, Bartlett, 1950, p. 4). It might be thought that there would be a direct connexion between $T_i$ and $I_{2n+1}(\omega)$, but this is not so since
\[ \{A_{2n}(\omega_2) - A_{2n}(\omega_1)\}^2 + \{B_{2n}(\omega_2) - B_{2n}(\omega_1)\}^2 = \frac{2n+1}{8\pi^2}\,\Bigl|\int_{\omega_1}^{\omega_2} H_{2n+1}(\omega)\, d\omega\Bigr|^2. \]

† Our requirement on the spacing of the $\omega_i$ for the results (11) and (12) to hold to a good approximation then becomes $\omega_{i+1} - \omega_i \gg \pi/n$.




Goodness of fit tests using these statistics may thus be obtained in the same way as those of § 3. Such tests have been proposed by several authors; for example, the likelihood ratio test is discussed by Bartlett (1950, pp. 1-9), and the Fisher $g$ test by Whittle (1952a, p. 47) and Sargan (1953). Their application need not be confined to the case of null hypotheses for which the spectral distribution function is absolutely continuous. For we can write $F_+(\omega) = F_+^{(1)}(\omega) + F_+^{(2)}(\omega)$, where $F_+^{(1)}(\omega)$ is absolutely continuous and $F_+^{(2)}(\omega)$ is a step function with a finite or enumerably infinite set of discontinuity points (cf. Bartlett, 1955, p. 163), and hence can always choose the $\omega_i$ so that $f_+(\omega_i)$ exists (in practice this will mean that $f_+(\omega)$ is also continuous near $\omega_i$).

, When f (t) is a sufficiently smooth function the power of one of these periodogram inten- 
aty tests should be approximately equal to that of the corresponding test based on the T; 
With intervals (O; @j34) such that the periodogram frequencies are Loit Oi). For let 
the spectral density be f(w) on the null hypothesis and fo) +FP(@) on an. alternative 

4 linear in (Wi 9; 4). 


n YPothesis, Then when the variation of f w, fO with c is approximately l 
€ expectations of the variance estimates sł on the alternative hypothesis become approxi- 


Mately equal to 1+ |/? ottaa) f 0x n in both cases, and these determine the 
+ 2 2 : 

power functions. The tests of §2 should be better for detecting large peaks in $f_x(\omega)$ of unknown location. For with the intensity tests, such peaks will have appreciable probabilities of being detected only when they occur at frequencies whose differences from some $\omega_i$ are of order $n^{-1}$.

However, when $f_x(\omega)$ is continuous for all $\omega$, intensity tests of greatly increased power may be obtained by taking frequencies $\omega_j = 2\pi j/(2n+1)$ $(j = 1, \dots, n)$ (cf. Whittle, 1952a; Sargan, 1953). For then the intensities $I_{2n+1}(\omega_j)$ and $I_{2n+1}(\omega_{j'})$ $(j \neq j')$ are asymptotically uncorrelated, at least when (as will usually be the case) we can write

$$X(t) = \sum_{u=0}^{\infty} g(u)\,\epsilon(t-u),$$

where the $\epsilon(t)$ are independent Normal variables with zero mean and constant variance and $g(u) \to 0$ exponentially as $u \to \infty$; this follows most easily from the asymptotic relation

$$I^{(X)}_{2n+1}(\omega) \sim \{\pi f_x(\omega)/\sigma^2\}\, I^{(\epsilon)}_{2n+1}(\omega) \qquad (19)$$

between the intensities for $\{X(t)\}$ and $\{\epsilon(t)\}$ (see Bartlett, 1955, p. 279). It is reasonable to suppose that (as assumed by Whittle and Sargan) the asymptotic distributions of the test criteria will not be affected by the fact that there are now $n$ intensities instead of a finite number $k$; this is easily demonstrated for the $M$ and $U$ tests, although it seems difficult to construct a satisfactory argument for the $g$ test. We may also note that, as was pointed out by Sargan (1953, p. 148), the intensity tests can be used when $\{X(t)\}$ is a discrete linear process. This follows at once, since with $X(t) = \sum_u g(u)\,\epsilon(t-u)$ we have the relation (19), and by the central limit theorem, the distributions of $n^{-1/2}\sum_{r=-n}^{n} \epsilon(r)\cos\omega r$ and $n^{-1/2}\sum_{r=-n}^{n} \epsilon(r)\sin\omega r$ tend to the Normal form with zero mean and variance unity, at least when the third absolute moment $E|\epsilon(t)|^3$ is finite.

We could alternatively consider, instead of individual intensities, integrals of the form


$$K_n(\omega_1, \omega_2) = \int_{\omega_1}^{\omega_2} I_{2n+1}(\omega)\,d\omega = 2\sum_{s=-2n}^{2n}\Bigl(1 - \frac{|s|}{2n+1}\Bigr) C_s\,(\sin\omega_2 s - \sin\omega_1 s)/s, \qquad (20)$$

where $C_s = \sum_{r=1}^{2n+1-|s|} X_r X_{r+|s|}/(2n+1-|s|)$. Then for continuous $f_x(\omega)$ we have


$$\lim_{n\to\infty} E\{K_n(\omega_1, \omega_2)\} = 2\pi\int_{\omega_1}^{\omega_2} f_x(\omega)\,d\omega, \qquad (21)$$
lim nyar Kufor o) = 7 [^ Gur oye (22) 
n1—-co o 
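and

$$\lim_{n\to\infty} n\,\operatorname{cov}\{K_n(\omega_1, \omega_2),\, K_n(\omega_3, \omega_4)\} = 0, \qquad (23)$$

when $(\omega_1, \omega_2)$ and $(\omega_3, \omega_4)$ do not overlap (see Grenander, 1951, pp. 521-3; Grenander &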
Rosenblatt, 1953, pp. 541-3). It can also be shown that under wide conditions (any whi 


7.00, to the Normal form with zero means and finite c 
that 


KK, (0;, 0,,)) — 2n [f (odo —o(n-Y) as noo, 
[271 


Which is certainly true if f. (v) has a bounded derivative ( 
Theorem 2), the Statistics 


v; = nt [x.t 01,1)— 27 [7^ 7, (o) ao] Jie ug, ye do| ; Ph 


where the intervals (Oi, w41) do not overlap, 
independent Normal variables with zero mear 


53, 
Grenander & Rosenblatt, 195 


x as 
are for large n distributed approximately 
n and variance unity. right 
tests based on the v,; for example, we mig 


k-1 ii "red 
use the criterion 2 9, rejecting the null hypothesis if this is significantly large when refer 
i=0 


"ful 
to a x? distribution with & degrees of freedom. Such tests should be much more power 


n7, However, there is the disadvantage 
fo) completely, the substitution of esti 
the asymptotic distribution of the test cri 


which may be of order unity. 


5. NUMERICAL ILLUSTRATION 


To illustrate the methods described in §3, these were applied to test the goodness of fit of a set of observations from a particular series $\{X(t)\}$, a third-order linear autoregressive process with


25) 
X(¢4+2)-1-1X(¢41)40-5xX(t) = Y(t), HCE) =O Y a es any [en 


e e 
o; = gain (i = 0,1, 7: 24) so that each interval was of length Ya = 87/n; this might n 
thought rather small, but an examination of the exact expressions for the variances and covariances of the quantities $A_{2n}(\omega_{i+1}) - A_{2n}(\omega_i)$, $B_{2n}(\omega_{i+1}) - B_{2n}(\omega_i)$ along the lines of §6 indicated that the $\chi^2$ approximations to the distributions of the $T_i$ should be fairly accurate.


From (25), we have

$$f_x(\omega) = 1/[\pi\,|(1 - 1.1e^{i\omega} + 0.5e^{2i\omega})(1 - 0.1e^{i\omega})|^2]$$
$$\qquad\; = 1/[\pi(2\cos^2\omega - 3.3\cos\omega + 1.46)(1.01 - 0.20\cos\omega)]. \qquad (26)$$
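The two forms of (26) can be checked against each other numerically. The sketch below is illustrative only: the function names and the grid of 2401 points on $[0, \pi]$ are choices made here, not part of the paper.

```python
import cmath
import math


def f_factored(w):
    """Spectral density from the factored autoregressive operator."""
    z = cmath.exp(1j * w)
    return 1.0 / (math.pi * abs((1 - 1.1 * z + 0.5 * z * z) * (1 - 0.1 * z)) ** 2)


def f_cosine(w):
    """Spectral density written as a rational function of cos(w)."""
    c = math.cos(w)
    return 1.0 / (math.pi * (2 * c * c - 3.3 * c + 1.46) * (1.01 - 0.20 * c))


grid = [math.pi * j / 2400 for j in range(2401)]
# Largest disagreement between the two expressions over the grid.
max_gap = max(abs(f_factored(w) - f_cosine(w)) for w in grid)
# Grid point at which the density is largest.
peak = max(grid, key=f_cosine)
```

On this grid the two expressions agree to rounding error, and the single peak falls a little below $\omega = \tfrac{1}{5}\pi$.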
The graph of $f_x(\omega)$ plotted against $\omega$ is shown in Fig. 1; it has a single fairly sharp peak at a value of $\omega$ approximately equal to $\tfrac{1}{5}\pi$. Also on integrating (26) we find
lla 2-6(a + 0°34185)? + 0-09617 
mF = 0-0564: SI) (eck -7917 
uli ORCAS ( 9 ) + 0719173108 5:55 0-34185)? + 0-09617 
0:38465x 
0-15385 — 2?’ 


(27) 


+ 3-33039 tan 


where $x = \tan(\tfrac{1}{2}\omega)$.


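[Figure 1: graph of the spectral density $f_x(\omega)$ against $240\omega/\pi$, horizontal axis from 0 to 240.]

Fig. 1. Spectral density function for Kendall's Series 16.

The values of $\pi[F_x(\omega_{i+1}) - F_x(\omega_i)]$ for $i = 0, 1, \dots, 23$ are given in Table 1. The corresponding values of

$$P_i = 4\{A_{2n}(\omega_{i+1}) - A_{2n}(\omega_i)\}^2/\{F_x(\omega_{i+1}) - F_x(\omega_i)\},$$
$$Q_i = 4\{B_{2n}(\omega_{i+1}) - B_{2n}(\omega_i)\}^2/\{F_x(\omega_{i+1}) - F_x(\omega_i)\}$$

and $T_i = P_i + Q_i$ are given in Table 2.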
23 r ; . 
The likelihood ratio criterion

$$M = \sum_{i=0}^{23}\{T_i - 2\log T_i\} + 48(\log 2 - 1) = 24.34,$$

which is clearly not significantly large when referred to a $\chi^2$ distribution with 24 degrees of freedom, so that with the $M$ test we should draw the correct conclusion that the observations are consistent with the spectral distribution function of the series being the specified $F_x(\omega)$.


If the variance of $\epsilon(t)$ had not been specified, we would have used the homogeneity of variances criterion $M' = 23.67$ with 23 degrees of freedom, which again is not significantly large (the difference $M - M'$, with 1 degree of freedom, which gives a test for $\sigma^2 = 1$ against alternatives $\sigma^2 \neq 1$, is also not significant).


Table 1. Distribution of spectral 'power' over intervals of length $\pi/24$ for Kendall's Series 16

   i    $\pi[F_x(\omega_{i+1}) - F_x(\omega_i)]$       i    $\pi[F_x(\omega_{i+1}) - F_x(\omega_i)]$
   0    1.0220                  12    0.0763
   1    1.0953                  13    0.0576
   2    1.2440                  14    0.0448
   3    1.4421                  15    0.0361
   4    1.5558                  16    0.0299
   5    1.3799                  17    0.0255
   6    0.9734                  18    0.0223
   7    0.6029                  19    0.0199
   8    0.3668                  20    0.0183
   9    0.2309                  21    0.0171
  10    0.1523                  22    0.0164
  11    0.1056                  23    0.0161

                              Total   10.5513
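The entries of Table 1 can be reproduced, to the accuracy printed, by integrating $\pi f_x(\omega)$ from (26) over each interval of length $\pi/24$. The sketch below uses a composite Simpson rule; the helper names and the choice of 200 sub-intervals are assumptions of this example, not the method used in the paper (which integrates (26) in closed form).

```python
import math


def pi_f(w):
    """pi * f_x(w) for the density (26)."""
    c = math.cos(w)
    return 1.0 / ((2 * c * c - 3.3 * c + 1.46) * (1.01 - 0.20 * c))


def simpson(g, lo, hi, m=200):
    """Composite Simpson rule with an even number m of sub-intervals."""
    h = (hi - lo) / m
    s = g(lo) + g(hi)
    for j in range(1, m):
        s += (4 if j % 2 else 2) * g(lo + j * h)
    return s * h / 3


edges = [i * math.pi / 24 for i in range(25)]
power = [simpson(pi_f, edges[i], edges[i + 1]) for i in range(24)]
total = sum(power)
```

The total equals $\pi$ times the variance of the series, so it also serves as a check on the fitted autoregressive scheme.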


Table 2. Values of test statistics for Kendall's Series 16

   i    $P_i$     $Q_i$     $T_i$          i    $P_i$     $Q_i$     $T_i$
   0    0.1949    2.2857    2.4806        12    0.0001    1.4326    1.4327
   1    0.4505    0.6195    1.0700        13    0.0053    2.8015    2.8068
   2    1.0072    2.8562    3.8634        14    0.1729    0.0048    0.1777
   3    0.0134    0.0336    0.0470        15    1.7855    0.8365    2.6220
   4    0.1471    0.6799    0.8270        16    0.4132    0.7761    1.1893
   5    0.8415    0.1389    0.9804        17    0.7310    0.0200    0.7510
   6    0.5004    4.8900    5.3904        18    0.5207    0.7617    1.2824
   7    0.0069    0.2930    0.2999        19    0.1436    4.1721    4.3157
   8    0.2438    2.5375    2.7813        20    0.0001    2.3556    2.3557
   9    1.0291    0.4558    1.4849        21    0.0027    0.0886    0.0913
  10    1.4156    0.5018    1.9174        22    0.6150    0.8077    1.4227
  11    0.1182    0.1355    0.2537        23    0.0400    0.4936    0.5336

                                        Total   10.3987   29.9782   40.3769


Again we have

$$U = -2\sum_{i=0}^{23}\log 2y_i = \sum_{T_i \ge 2\log 2} T_i - 2\sum_{T_i < 2\log 2}\log(1 - e^{-\frac{1}{2}T_i}) - 48\log 2 = 40.63,$$

with 48 degrees of freedom, and since there are 12 values of $i$ with $T_i > 2\log 2$,

$$U_1 = -2\sum_{T_i < 2\log 2}\log(1 - e^{-\frac{1}{2}T_i}) - 24\log 2 = 24.39, \quad\text{with 24 degrees of freedom,}$$

$$U_2 = \sum_{T_i \ge 2\log 2} T_i - 24\log 2 = 16.24, \quad\text{with 24 degrees of freedom.}$$

Neither $U$, $U_1$ nor $U_2$ is significantly large.
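The criteria quoted above can be recomputed from the $T_i$ column of Table 2. In the sketch below the list `T` is transcribed from that table, with the few entries that are illegible in the scan taken as $P_i + Q_i$ (an assumption of this example). The variable names `U_upper` and `U_lower` are hypothetical labels for the contributions from $T_i \ge 2\log 2$ and $T_i < 2\log 2$ respectively.

```python
import math

# T_i = P_i + Q_i for i = 0, ..., 23 (Table 2; illegible entries rebuilt as P_i + Q_i).
T = [2.4806, 1.0700, 3.8634, 0.0470, 0.8270, 0.9804, 5.3904, 0.2999,
     2.7813, 1.4849, 1.9174, 0.2537, 1.4327, 2.8068, 0.1777, 2.6220,
     1.1893, 0.7510, 1.2824, 4.3157, 2.3557, 0.0913, 1.4227, 0.5336]
k = len(T)

# Likelihood ratio criterion M = sum(T_i - 2 log T_i) + 2k(log 2 - 1).
M = sum(t - 2 * math.log(t) for t in T) + 2 * k * (math.log(2) - 1)

# U = -2 sum log(2 y_i) with y_i = min(p_i, 1 - p_i) and p_i = exp(-T_i / 2).
cut = 2 * math.log(2)
upper = [t for t in T if t >= cut]   # p_i <= 1/2
lower = [t for t in T if t < cut]    # p_i > 1/2
U_upper = sum(t - cut for t in upper)
U_lower = sum(-2 * math.log(2 * (1 - math.exp(-t / 2))) for t in lower)
U = U_upper + U_lower
```

Each group contains 12 values, so each partial sum is referred to $\chi^2$ with 24 degrees of freedom and their total to $\chi^2$ with 48.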
As a further check, values of the corresponding criteria $U$, $U_1$, $U_2$ for the sets $\{P_i\}$ and $\{Q_i\}$ of approximately independent variables, each distributed as $\chi^2_1$, were obtained. For the $P$ set, $U_1 = 41.17$ (30 D.F.), $U_2 = 7.05$ (18 D.F.), $U = 48.22$ (48 D.F.), and for the $Q$ set, $U_1 = 13.95$ (14 D.F.), $U_2 = 30.79$ (34 D.F.), $U = 44.74$ (48 D.F.). The values for the $Q$ set agree closely with their expectations, but for the $P$ set, $U_1$ is greater than the upper 10% point of $\chi^2_{30}$ while $U_2$ is only just greater than the lower 1% point of $\chi^2_{18}$.

This indicates that the values of $P_i$ tend to be too low, which is confirmed by the fact that $\sum_{i=0}^{23} P_i = 10.40$ is less than the lower 1% point of $\chi^2_{24}$. This effect is somewhat surprising. The exact expressions show that the values of $E(P_i)$ tend to be less than unity, but the negative bias does not seem to be large enough to provide a satisfactory explanation. A detailed examination of the accuracy of the $\chi^2$ approximations in this particular case might be of interest, but has not been carried out because of the extremely tedious calculations required. There is of course always the possibility that the effect might be traced to some peculiarity in the series of residuals $\{\epsilon(t)\}$ (compare the experience of Bartlett & Rajalakshman, 1953, p. 120).


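6. DERIVATION OF THE APPROXIMATIONS OF §2

It is easy to show that

$$4\pi^2 E[A_{2n}(\omega_2) - A_{2n}(\omega_1)]^2 = \int_0^\pi [K_n^{12}(u) - L_n^{12}(u)]^2\,dF_x(u), \qquad (28)$$

$$4\pi^2 E[B_{2n}(\omega_2) - B_{2n}(\omega_1)]^2 = \int_0^\pi [K_n^{12}(u) + L_n^{12}(u)]^2\,dF_x(u), \qquad (29)$$

$$4\pi^2 E[\{A_{2n}(\omega_4) - A_{2n}(\omega_3)\}\{A_{2n}(\omega_2) - A_{2n}(\omega_1)\}] = \int_0^\pi [K_n^{12}(u) - L_n^{12}(u)][K_n^{34}(u) - L_n^{34}(u)]\,dF_x(u), \qquad (30)$$

and

$$4\pi^2 E[\{B_{2n}(\omega_4) - B_{2n}(\omega_3)\}\{B_{2n}(\omega_2) - B_{2n}(\omega_1)\}] = \int_0^\pi [K_n^{12}(u) + L_n^{12}(u)][K_n^{34}(u) + L_n^{34}(u)]\,dF_x(u), \qquad (31)$$

where

$$K_n(u, \omega_1, \omega_2) = G_n(u - \omega_1) - G_n(u - \omega_2), \qquad (32)$$

$$L_n(u, \omega_1, \omega_2) = G_n(u + \omega_1) - G_n(u + \omega_2), \qquad (33)$$

$G_n$ being defined by

$$G_n(u) = \int_0^u \frac{\sin(n+\tfrac{1}{2})y}{2\sin\tfrac{1}{2}y}\,dy, \qquad (34)$$

and $K_n^{12}(u) = K_n(u, \omega_1, \omega_2)$, $L_n^{34}(u) = L_n(u, \omega_3, \omega_4)$, with $L_n^{12}(u)$ and $K_n^{34}(u)$ defined similarly.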


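For example,

$$4\pi^2 E[A_{2n}(\omega_2) - A_{2n}(\omega_1)]^2 = \sum_{r,s=-n}^{n}\Bigl(\frac{\sin\omega_2 r - \sin\omega_1 r}{r}\Bigr)\Bigl(\frac{\sin\omega_2 s - \sin\omega_1 s}{s}\Bigr)\int_0^\pi \cos u(r-s)\,dF_x(u)$$

$$= \int_0^\pi \Bigl|\sum_{r=-n}^{n} e^{iur}(e^{i\omega_2 r} - e^{-i\omega_2 r} - e^{i\omega_1 r} + e^{-i\omega_1 r})/(2r)\Bigr|^2 dF_x(u)$$

$$= \int_0^\pi [G_n(u+\omega_2) - G_n(u-\omega_2) - G_n(u+\omega_1) + G_n(u-\omega_1)]^2\,dF_x(u),$$

since

$$\sum_{r=-n}^{n} (e^{iur} - 1)/r = i\int_0^u \Bigl(1 + 2\sum_{r=1}^{n}\cos yr\Bigr)dy = i\int_0^u \frac{\sin(n+\tfrac{1}{2})y}{\sin\tfrac{1}{2}y}\,dy. \qquad (35)$$

(29), (30) and (31) are obtained in the same way.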
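The summation identity used in this step can be checked numerically. The sketch below assumes the definition $G_n(u) = \int_0^u \sin(n+\tfrac{1}{2})y/(2\sin\tfrac{1}{2}y)\,dy$, evaluates it by a composite Simpson rule, and compares $2iG_n(u)$ with the finite sum $\sum_{r=-n}^{n}(e^{iur}-1)/r$, whose $r = 0$ term is read as its limit $iu$. Function names and the number of sub-intervals are choices made for this example.

```python
import cmath
import math


def exp_sum(u, n):
    """sum_{r=-n}^{n} (e^{iur} - 1)/r, with the r = 0 term taken as its limit iu."""
    total = 1j * u
    for r in range(1, n + 1):
        total += (cmath.exp(1j * u * r) - 1) / r
        total += (cmath.exp(-1j * u * r) - 1) / (-r)
    return total


def G(u, n, m=2000):
    """G_n(u) by composite Simpson integration (m even)."""
    def g(y):
        if y == 0.0:
            return n + 0.5          # limiting value of the integrand at y = 0
        return math.sin((n + 0.5) * y) / (2 * math.sin(y / 2))
    h = u / m
    s = g(0.0) + g(u)
    for j in range(1, m):
        s += (4 if j % 2 else 2) * g(j * h)
    return s * h / 3


identity_gap = abs(exp_sum(1.3, 10) - 2j * G(1.3, 10))
# Reflection property G_n(2*pi - u) = pi - G_n(u), used later for arguments beyond pi.
reflection_gap = abs(G(2 * math.pi - 1.3, 10, m=4000) - (math.pi - G(1.3, 10)))
```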


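Now

$$G_n(u) = \int_0^u \frac{\sin(n+\tfrac{1}{2})y}{y}\,dy + \int_0^u \sin(n+\tfrac{1}{2})y\,\Bigl\{\frac{1}{2\sin\tfrac{1}{2}y} - \frac{1}{y}\Bigr\}dy = \mathrm{Si}(n+\tfrac{1}{2})u + I_n(u), \text{ say,} \qquad (36)$$

where by the Riemann-Lebesgue theorem (see, for example, Titchmarsh, 1944), $I_n(u) \to 0$ like $n^{-1}$ as $n \to \infty$. In fact, on integrating the second term of (36) by parts, we find that for $|u| \le \pi$,

$$|I_n(u)| \le (1 - 2/\pi)/(n+\tfrac{1}{2}) < 0.37/n. \qquad (37)$$

When the argument of $G_n$ exceeds $\pi$, as can occur in (33), it is convenient to use the relation $G_n(2\pi - u) = \pi - G_n(u)$, which follows at once from (36), and to write

$$G_n(u) = \pi - \mathrm{Si}(n+\tfrac{1}{2})(2\pi - u) + I_n'(u),$$

where $I_n'(u)$ still satisfies (37). Hence if

$$k_n^{12}(u) = k_n(u, \omega_1, \omega_2) = \mathrm{Si}(n+\tfrac{1}{2})(u - \omega_1) - \mathrm{Si}(n+\tfrac{1}{2})(u - \omega_2),$$

$$l_n^{12}(u) = l_n(u, \omega_1, \omega_2) = \mathrm{Si}(n+\tfrac{1}{2})(u + \omega_1) - \mathrm{Si}(n+\tfrac{1}{2})(u + \omega_2) \quad\text{for } 0 \le u \le \pi - \omega_2,$$

a similar definition applying for $u > \pi - \omega_2$ if $\mathrm{Si}(n+\tfrac{1}{2})(u + \omega_i)$ is replaced by $\pi - \mathrm{Si}(n+\tfrac{1}{2})(2\pi - u - \omega_i)$ when $u > \pi - \omega_i$ $(i = 1, 2)$, $k_n^{12}(u)$ and $l_n^{12}(u)$ will be, for moderately large $n$, good approximations to $K_n^{12}(u)$ and $L_n^{12}(u)$ respectively, and similarly $k_n^{34}(u)$ and $l_n^{34}(u)$ will be good approximations to $K_n^{34}(u)$ and $L_n^{34}(u)$.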

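Now $\mathrm{Si}(n+\tfrac{1}{2})u$ may be replaced to a good approximation by $\tfrac{1}{2}\pi$ when $u > \lambda\pi/(n+\tfrac{1}{2})$, $\lambda = O(1)$. In fact, since

$$\int_u^\infty \frac{\sin w}{w}\,dw = \frac{\cos u}{u} + \frac{\sin u}{u^2} - \int_u^\infty \frac{2\sin w}{w^3}\,dw = \frac{\cos u}{u} + \frac{2\theta_u}{u^2}, \quad\text{with } |\theta_u| < 1,$$

we have

$$|\mathrm{Si}(n+\tfrac{1}{2})u - \tfrac{1}{2}\pi| \le \frac{1}{\lambda\pi} + \frac{2}{\lambda^2\pi^2} \quad\text{when } u \ge \lambda\pi/(n+\tfrac{1}{2}),$$

so that with $\lambda = 2$, for example, the error in the approximation is not more than about 10%.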
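A short numerical check of this approximation is sketched below. It evaluates the sine integral $\mathrm{Si}(x) = \int_0^x (\sin t/t)\,dt$ by a composite Simpson rule and compares $|\mathrm{Si}(\lambda\pi) - \tfrac{1}{2}\pi|$ with the bound $1/t + 2/t^2$ at $t = \lambda\pi$; that form of the bound is taken here as an assumption consistent with the constants used later in this section. The function names are illustrative.

```python
import math


def Si(x, m=4000):
    """Sine integral by composite Simpson integration (m even)."""
    def g(t):
        return 1.0 if t == 0.0 else math.sin(t) / t
    h = x / m
    s = g(0.0) + g(x)
    for j in range(1, m):
        s += (4 if j % 2 else 2) * g(j * h)
    return s * h / 3


def phi(t):
    """Assumed bound on |Si(t) - pi/2| for t > 0."""
    return 1.0 / t + 2.0 / t ** 2


# (lambda, actual error, bound) at the boundary point t = lambda * pi.
checks = [(lam, abs(Si(lam * math.pi) - math.pi / 2), phi(lam * math.pi))
          for lam in (1, 2, 4, 8)]
relative_error_lambda_2 = checks[1][1] / (math.pi / 2)
```

For $\lambda = 2$ the actual error is a little under 10% of $\tfrac{1}{2}\pi$, in line with the figure quoted in the text.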
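It follows that to a good approximation we can put

$$l_n^{12}(u) = 0 \quad (0 \le u \le \pi), \quad\text{provided that } \omega_1 \gg \pi/n,\ \pi - \omega_2 \gg \pi/n, \qquad (38)$$

and

$$k_n^{12}(u) = \begin{cases} 0, & u \le \omega_1 - \lambda\pi/n,\ \ u \ge \omega_2 + \lambda\pi/n, \\ \pi, & \omega_1 + \lambda\pi/n \le u \le \omega_2 - \lambda\pi/n \end{cases} \qquad (\text{for } \omega_2 - \omega_1 > 2\lambda\pi/n). \qquad (39)$$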


Hence (28) and (29) are both approximately equal to

$$\int_0^\pi \{k_n^{12}(u)\}^2\,dF_x(u),$$

and therefore to

$$\pi^2[F_x(\omega_2 - \lambda\pi/n) - F_x(\omega_1 + \lambda\pi/n)] + \sum_{i=1}^{2}\int_{\omega_i - \lambda\pi/n}^{\omega_i + \lambda\pi/n}\{k_n^{12}(u)\}^2\,dF_x(u),$$

from which the result (11) of §2 follows under the stated conditions.
Also when approximations similar to (38) and (39) can be used for $l_n^{34}(u)$ and $k_n^{34}(u)$, (30) and (31) are approximately equal to $\int_0^\pi k_n^{12}(u)\,k_n^{34}(u)\,dF_x(u)$, and, using Schwarz's inequality,


| ý 4 —A7z[n " 
[5 moneo] [f rotor to [eiut 
sn 
| Where Es il " gu) Fo). Ta- f; Qr dF(u), 
T" a 
50 that for w — o > 2A7 |n, 
|f Jiu) o (ur) dF,(u) | & (Vi Va). (40) 
0 

0) will also hold for ws— wg « 22r/n if 
| [s 122(u) KAU) AP) | K4 Vae Yaa)» 
| oy —An[n 

Which is certainly true when 

Ws+An|n 
Í batai gej aE) Vi Í pn (ES (u) JF, (u) «Vas. 
uy—AnIn i 


The results of §2 then follow.
The results (11) and (12) when the end-point of an interval coincides with 0 or $\pi$ can be justified by a similar argument.
To make the discussion rigorous we require expressions for upper bounds to the errors in the approximations. These are easily obtained, although their form is somewhat complicated. For example

Am? B{Aan (2) — Aon 

om k and l without ambiguity), where from (37), 


(o) f 601,00) 3,00) dE, (0) (41) 


(dropping the superscripts fr 
"ELS 


Now (41) is equal to jh à u) aE +BY + Be (42) 
0 
i - (u)—1,(w)) dF. me 
Where RO= L saar) 2, Ox (ke, (tu) — L, (2) AF, (u) " AG 
m Es $^ Ce ‘i . 
L,(w) b, QU) dF(u). e QR) jS 


Shi ng- edF. en? P 
0 ^ =, 




a ie kalu) dP, (u) =7{F, (00) — F,(o,)} + RO + B®, (45) 
where 
&y— Àz[(n4- 3) 7 €s— Az [(n-4- 13) E 
mE (J +] Jio dF,(u) + | (wn) aB,(u) (48) 
0 03 Az [(n.3- 3)/ 0,4-Àz[(n4-3) 


Oy @2+An/(n+3) 
and R®= (| +Í ) ks (wu) dE, (u) 
03y—AÀ7[(1 4-3) 


9- Àz[(n4-1) Oy 
+ ( | if f ) (12 (u) —2)d E, (u) 
o os—Ar[(n-3) 


Aa|(n+4) (47) 
- [t 2) {AP (04-44) - dE lo), 
Ad 
T—— Wind, of i kalor +u) =k? (v, — u) m (48) 
kilo +u) —1 (u>0). 


Hence the magnitude of the relative error in the approximation does not exceed 
PIECE M lp 


Upper bounds for the first three remainder terms (43), (44) and (46) may be derived bY 
using the inequality 


| Si(n--3)u— ym | <4 ata ét), say (w>0). 


This gives |L(u)|« PECES Jmin(otw,27—o,—w)]«L (O<u<n), " 
ES 2 51) 
where (na p min (o4, z — w) [i+ ta +ł)min (o, 7— al (5 
[00e X dle DG pe Ks ucuna " (52) 
where 1=4/(3A7) + 20/(9A272) (53) 
(assuming that w — w; > 2Az[n), 
| kn (a) |< $ SEN (HON < Ky msan (54) 
and | E, (u) ^7 | < piln 4) (u— 093] ó[(n4- 3)( (to, — u)] y 
< Ky, Oy +An|(n+})<u<o,— —An|(n+4), (59) 
where Ky=2/An+ 4/272, (56) 
Hence we have 
[R |< C5] Fem) I) |LR, (m). INIT ar, qo), qe 
| RO | c LF smerf” | kalu) | dF (u), | (58) 


with fi k,(u) | GF, (u) « Von es Xn] (n 3)) R (o, — Am](n 4 )] + KE, (m) 


(| 5.) | < 1-27 for all u), (69) 


and | RO < KIE, (s) — F, (oat An(n +p) -F (o Ann] 
X UG 27) Lf, (os — Ar] (n9) -Elo Ar] (n1 
< K3F, (7) +K,(Ky+2m) [F. (03) — F.(91)]- (60) 


E the fourth remainder term (47), we note that to a good approximation W, is in- 
a Pendent of w, — w,, increasing from 0 to 17? as u increases from —Az/(n+4) to 0, and 
Om — £7? to 0 as u increases from 0 to Az/(n-* 1) Hence R® is at most of the order of 


5 [Fot An[n) — F..(9;— Ar[n)]- 


i=l 
* fact we have 0<W,(oj— opt) «re ome (<0) 
0» Wo- upu) > Gr-d0mP-7* (20 
(Provided that jr > (Am), which certainly holds for A> 1), so that 
Ho — An |(n 4-3) + Feet An|(n+4)) — F (3). 
— F (03) + F.(93) — F,(@,—An|(n+4)}), (01) 


|< max [fr da) (F9) — Fs 
m- Qa — Qr {Flor 0r] (n3) 
°t, When ¢(Az) can be neglected in comparison with 47, 


| R® | e n? max [HF (01) -F01 Atl”) +F, (w+ Amn) — F+ (9), 
HF (ot An/n) — F, (03) + F(@2) — F, (0 — Az[n)]]. (62) 


nd, U say, for (49). For U to be small, we certainly 
Ust take A ; ter than unity, since the term K,(K; + 27)/72> 2/(5A. 

t ciably gre& : 2 ) 

Cours: the dete E A T. 4, for example, is 0:12. However, A must also not be too large 
cause of the contribution to U from (61) or (62), which in general will be small only for 
2— 0,5 Agr|[n. "Thus the condition that U is to be small may impose a fairly severe re- 


Striction on the minimum admissible length of interval «9 — 9. The contribution from the 
al to 


maining terms in U, which is eq" 
[F (02 Ar] (n p) —F,(91— An] (n 3-3) BLU (2) — £(0)), (63) 


2L+3/n)]/7?, C,- 12 (9L 3/m)jm, 


From (57)-(62) we obtain an upper bow 


(C, F, (7) + C2 
pes Q,- KL 15n)? 4 KG 
Will usually be negligible unless Fg) — F 01) 38 a very small fraction of F,(z); e.g. if 
As 4, w, and z — > 167][n. n= 200, we have O,< 0:003 and C, < 0-04. 

From the behaviour of the functions $W_n(u)$, $L_n(u)$ and $k_n(u)$ it is easy to see that $U$ will often be a very conservative upper bound, so that the approximation may be quite accurate even when $U$ is not small (this is the case, for instance, in the example of §5). However, to obtain a better estimate of the relative error it seems to be necessary to make more detailed assumptions about the behaviour of $F_x(\omega)$. For example, if $F_x(\omega)$ has a derivative $f_x(\omega)$ which is continuous over an interval covering $(\omega_1, \omega_2)$, we can show that the approximation may be quite good even when $\omega_2 - \omega_1$ is only of order $\pi/n$. In fact if we have


w 3 
o= + 


<e for «ó, 


noA”) 


where ô> (A+ 




tu) 7/(n +3), with = (n+ 3) (o5 — w,)/7, and assume that c may be Pr. 
we find that the contribution to the relative error of R and the last integral in R®, fro 
which the dominant terms in U usually arise, is 


(2/79) | mao dy) du, (05 
0 


A ing the 
which is, for example, «0-10 when 4/24, A» 2, [This follows quite simply by using th 
results 


fi Sodu=1+20y, f'Sta- nyo qoh olen. 

0 J 

where S(w) = 47—Si(u).] M. 
All the above applies equally to the approximation to $4\pi^2 E\{B_{2n}(\omega_2) - B_{2n}(\omega_1)\}^2$, since we only have to change the sign of $l_n(u)$ in (41). The expressions for the covariances, given in (30) and (31), may be dealt with in much the same way; we may note that for $\omega_3 = \omega_2$ it can be shown, by considering an integral similar to (64), that taking these to be zero may be a good approximation for $\omega_2 - \omega_1$, $\omega_4 - \omega_3$ of order $\pi/n$ when $f_x(\omega)$ can be treated as constant over an interval containing $\omega_2$.


| 


APPENDIX 


Asymptotic power functions for the tests of $3 
(1) The U test 


We assume that to a sufficiently good approximation, $T_i = 2\sigma_i^2 x_i$, where the $x_i$ are distributed independently with probability density functions $e^{-x_i}$ $(0 \le x_i < \infty)$. The $y_i$ will be distributed independently with probability density functions

$$(1 + \epsilon_i)\{y_i^{\epsilon_i} + (1 - y_i)^{\epsilon_i}\} \quad (0 \le y_i \le \tfrac{1}{2};\ i = 0, 1, \dots, k-1), \qquad (1)$$

where $\epsilon_i = (1/\sigma_i^2) - 1$.

Let v; ju; — log 2y;. Then from (1) we have 
EQ) — (14-6) [ar ii "opened do, } | "ee" (1 — dee «| 
0 2Jo 
=(1+e THH 1) 


oes) orem(i + X Pale Ijli es 1) qat) A 


=1 


L3 (6, 1)... (62-8 -E-L (2) 
=A +e) Te0) [ay eem. Ecrit 
he . 4 
(the interchange of the order of integration and summation being easily justified). Let the $\sigma_i^2$ have a positive lower bound and a finite upper bound. Then from (2) we easily see that the third absolute moments of the $v_i$ about their means have a finite upper bound and that the variances of the $v_i$ have a positive lower bound. Hence by Lyapunov's form of the central limit theorem (Cramér, 1946, p. 215), the distribution of $\{V - E(V)\}/\sqrt{\operatorname{var} V}$, where $V = \sum_{i=0}^{k-1} v_i = \tfrac{1}{2}U$, tends to the Normal form with zero mean and variance unity when $k \to \infty$.



Let V, 
TA n the critical value of V when the significance level of the test isa. Then for large 
Jk, since E(V) - k— var V when ¢;=0, and the power function Pr (F > V) is 


approximatel 
y equal to 
E. 1— O[(k+ Eak- E(V)) A Gear V))- (3) 
W when the e; are of order k-4, (2) gives 
peal eo 1 
B(V)=k+= X elogi- X x12 
(ne bes Zae- È aes ya tO) 
k-1 
=h—0-42 X e;+0(1), 
i-o 
and 
var V —k--O(JKk). Hence (3) then becomes approximately equal to 
k—1l 
- o(é. +0-42k-4 Ze) " (4) 
2 d 
Q) The modified U test 


The pr a 
he probability density function of y, 
(gated (1-6) y? (0« y; « 9- 


conditional on ; « lis 


He 
8i nce Eoi p.e 169) Tr i 
Imilay]y we have 
Ü T a 
E(t | p; 2 3) (1-6) lr «(E 3:5)" eis eee a- = d 


Tt t} 
a ome follows, by the central limit theorem, that when the 9? have a positive lower bound 
BL finite upper bound, the distributions of (U;— -E(N (var Uj) (j=1, 2) conditional 
um Specified values of i with pe > 4, tend to the Normal form with zero mean and variance 
NN ky and k, 2 k— kı > 
k be large and the 6; be o 


Pr(p;c$ 
We have ' 
B(k,) 2 4e OW var kı 


robabilities 7; 


f m 1-3. Since 
y= qiie 40 6 log 1) +O(k); 
=}k+O0(1), 
to > A k, are O(/t). The conditional p = Pr (Uj» 2(k; + A4k;)) are then given 
approximation by 
mel- 0-9 1615 32, 6j); 
m71— (a+ hy? E26) " 


v X, having the same meaning ce from (5) and (6), 
E(Z,v)- 5-016 z,6 4 O(1); var Zv; =k, t OE). 


and 
E(Z,v) =he— 22% 4-0(1). var gv; — ko (E). 


al value of U; for the modified U test. From (7) with 


hen the significance level 4&1, A-£j,. The power 


(à 


as in $3) sin 


N 
PM let 2(k;-+Aa/k;) be the critic 
find; Ae O(a), so Ui Y 
lon ot the test is 
Em 4pm — TTo)» 8) 


where the expectation is taken over all possible sets of values of $i$. Although (8) is a very complicated expression, its asymptotic form is easily found. For


E(,e)- X Qe l S 6, 100), 
i-0 mi 
k=-1 asa RS 2 ET 
and varX,6;— E (Ha f1— (gee = 1 PE T O(k1), 
so that 7,71— 60(A—0-160//2), 7,2: 1— (A+ 0],/2), 


k-1 ; 
where 0=k-! Y; e; Hence (8) is approximately equal to 
i=0 (9) 
1— (À — 0-160/,/2) (A + 0],/2). 
Clearly (9) approaches unity when $\theta$ takes sufficiently large positive or negative values.
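(3) The M test

With $T_i = 2\sigma_i^2 x_i$, the cumulant generating function of $M$ becomes

$$\log E(\exp M\phi) = \sum_{i=0}^{k-1}\log\Bigl[e^{-2\phi}\int_0^\infty e^{-x_i(1 - 2\phi\sigma_i^2)}(\sigma_i^2 x_i)^{-2\phi}\,dx_i\Bigr]$$

$$= k\{\log\Gamma(1 - 2\phi) - 2\phi\} - \sum_{i=0}^{k-1}[2\phi\log\sigma_i^2 + (1 - 2\phi)\log(1 - 2\phi\sigma_i^2)]$$

$$= k\Bigl[\sum_{r=1}^{\infty}\psi^{(r)}(1)\frac{(-2\phi)^r}{r!} - 2\phi\Bigr] + \sum_{i=0}^{k-1}\Bigl[(1 - 2\phi)\sum_{r=1}^{\infty}\frac{(2\phi\sigma_i^2)^r}{r} - 2\phi\log\sigma_i^2\Bigr],$$

where $\psi^{(r)}(x) = (d/dx)^r\log\Gamma(x)$. Hence the cumulants of $M$ are given by

$$\kappa_1(M) = -2k\psi^{(1)}(1) + 2\sum_{i=0}^{k-1}(\sigma_i^2 - 1 - \log\sigma_i^2), \qquad (10)$$

$$\kappa_r(M) = (-2)^r k\psi^{(r)}(1) + 2^r (r-2)!\sum_{i=0}^{k-1}\{(r-1)\sigma_i^{2r} - r\sigma_i^{2r-2}\} \quad (r > 1). \qquad (11)$$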


It is easily seen that when the $\sigma_i^2$ and their reciprocals have a finite upper bound, the distribution of $M$ tends to the Normal form as $k \to \infty$. Also if the $\epsilon_i$ are of order $k^{-1/2}$, we have from (10) and (11)

$$E(M) = 1.15k + \sum_{i=0}^{k-1}\epsilon_i^2 + O(k^{-1/2}),$$

and $\operatorname{var} M = 2.58k + O(1)$.
Thus for large $k$, $M_\alpha = 1.15k + \xi_\alpha\sqrt{2.58k}$, $M_\alpha$ being the critical value of $M$ for significance level $\alpha$, and the power function $\Pr(M > M_\alpha)$ is approximately equal to

$$1 - \Phi\Bigl\{\xi_\alpha - \sum_{i=0}^{k-1}\epsilon_i^2\Big/\sqrt{2.58k}\Bigr\}.$$

REFERENCES

Bartlett, M. S. (1937). Proc. Roy. Soc. A, 160, 268.
Bartlett, M. S. (1950). Biometrika, 37, 1.
Bartlett, M. S. (1954). Publ. Inst. Statist. Univ. Paris, 3, Fasc. 3, p. 119.
Bartlett, M. S. (1955). Stochastic Processes. Cambridge University Press.
Bartlett, M. S. & Rajalakshman, D. V. (1953). J. R. Statist. Soc. B, 15, 107.
Box, G. E. P. (1949). Biometrika, 36, 317.
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton.
Doob, J. L. (1952). Stochastic Processes. New York: Wiley.
Fisher, R. A. (1929). Proc. Roy. Soc. A, 125, 54.
Fisher, R. A. (1941). Statistical Methods for Research Workers, 8th ed. Edinburgh: Oliver and Boyd.
Grenander, U. (1951). Ark. Mat. 1, 503.
Grenander, U. & Rosenblatt, M. (1953). Ann. Math. Statist. 24, 537.
Kendall, M. G. (1949). Biometrika, 36, 267.
Sargan, J. D. (1953). J. R. Statist. Soc. B, 15, 140.
Titchmarsh, E. C. (1944). Theory of Functions, 2nd ed. Oxford University Press.
Whittle, P. (1951). Hypothesis Testing in Time Series Analysis. Uppsala.
Whittle, P. (1952a). Trab. Estadistica, 3, 43.
Whittle, P. (1952b). Biometrika, 39, 309.
Whittle, P. (1954). Appendix to A Study in the Analysis of Stationary Time Series, by H. Wold, 2nd ed. Uppsala.


SUFFICIENCY CONDITIONS IN REGULAR MARKOV CHAINS
AND CERTAIN RANDOM WALKS


By J. GANI
Nuffield Fellow, Statistical Laboratory, The University of Manchester*


For positively regular Markov chains with a finite number of states, transition probabilities of the form
$$p_{ij}(\theta) = a_{ij}\exp\{K_{ij}A_1(\theta) + A_2(\theta)\},$$
are known to admit a sufficient estimator of $\theta$ in realizations of the chain starting with a fixed state and consisting of a fixed number of transitions.

This paper considers whether transition probabilities of the same form will admit a sufficient estimator of $\theta$ in other finite regular, but not positively regular, Markov chains. For chains with an irreducible subset of two or more states, in which a realization starts from a fixed state and consists of a fixed number of transitions, these probabilities are found to admit a maximum-likelihood estimator of the function $g(\theta) = -A_2'(\theta)/A_1'(\theta)$, which is sufficient and unbiased.

There is some difference in chains with an absorbing state, in which realizations start from a fixed state but continue until the absorbing state is reached; in this sequential case, the maximum-likelihood estimator, with the number of transitions in the realization, together provide a sufficient estimator of the function $g(\theta)$, which in general is no longer unbiased.

We restrict ourselves to the particular case where a certain linear relation is satisfied. This leads to some simple stochastic matrices admitting a sufficient estimator of $\theta$, which consist of probabilities of the forms $\theta$ and $1 - \theta$; in some of these cases, unbiased sufficient estimators of $\theta$ reduce to known results of Girshick, Mosteller & Savage (1946).

Some non-regular finite and infinite chains with absorbing states, associated with random walks, whose matrices consist of various patterns of probabilities $\theta$ and $1 - \theta$, are also found to admit a sufficient estimator of $\theta$. The paper ends with the examination of such an example, arising in sequential estimation.

1. INTRODUCTION

In a recent paper (Gani, 1955), the author considered some sufficiency conditions holding for positively regular Markov chains with a finite number $s$ of states $E_1, \dots, E_s$; for such chains, stationary probabilities exist and are all positive.

It was shown that, for a realization of the chain starting with a fixed initial state and consisting of a fixed number $n$ of transitions, a sufficient estimator of $\theta$, the parameter defining the transition probabilities $p_{ij}(\theta) = \Pr\{E_j \mid E_i\}$ $(i, j = 1, \dots, s)$, existed if in any row $i$ of the stochastic matrix $P = (p_{ij})$, these probabilities were of the form

$$p_{ij}(\theta) = a_{ij}\exp\{K_{ij}A_1(\theta) + A_2(\theta)\} \quad (j = 1, \dots, s), \qquad (1)$$

where the constants $a_{ij} \ge 0$, and the number $r$ of distinct exponents $K_{ij}$ in the row could be less than or equal to the number $s$ of states. Since $\sum_j p_{ij} = 1$, the functions $A_1(\theta)$ and $A_2(\theta)$ were related by the equation

$$\exp\{-A_2(\theta)\} = \sum_{j=1}^{s} a_{ij}\exp\{K_{ij}A_1(\theta)\}; \qquad (2)$$

from this it followed that the transition probabilities in the remaining rows of the matrix, except possibly for coefficients, were also given by the $r$ distinct forms of the $p_{ij}(\theta)$ corresponding to the $r$ distinct values of the $K_{ij}$ in row $i$.

* The greater part of this work was completed while the author was at the Australian National University, Canberra A.C.T., Australia.


p" 


We now inquire whether a form similar to (1), for the transition probabilities admitting a sufficient estimator of $\theta$, exists when a chain is regular, that is, with stationary probabilities some of which may be zero. It is known (Bartlett, 1955, §2.21; Feller, 1950, §15.5) that regular chains, other than those which are positively regular, have stochastic matrices the simplest of which may be written as

$$P = \begin{pmatrix} S & 0 \\ Q & R \end{pmatrix}, \qquad (3)$$

where $S$, a square submatrix of transition probabilities depending on $\theta$, constitutes a single irreducible closed subset, such that the states in it form a positively regular chain. The submatrices $Q$ and $R$ also consist of transition probabilities depending on $\theta$, and $0$ is a matrix of zero elements. The case of particular interest, which arises in sequential analysis, is that in which the irreducible closed set consists of a single 'absorbing' state allowing entry but no exit; $S$ is then the single element 1.

In the two cases, where the closed set consists (i) of two or more states none of which is absorbing, (ii) of a single absorbing state, we shall consider two distinct types of realizations of the chain. For the first, exactly as with positively regular chains, we shall take realizations consisting of a fixed number $n$ of transitions and starting from a fixed initial state; these may end with any one of the accessible states in the chain, nor does the possible entry of the system into the closed set terminate the process of transition from one state to another of it. For the second case, however, a realization starting from a fixed initial state and consisting of a similar fixed number $n$ of transitions could result in the absorbing state being reached in $\nu < n$ transitions. In such cases, as is usual in sequential problems, we consider realizations in which the process is allowed to run until the absorbing state is reached; the number of transitions $\nu$ required for this to happen is then itself a random variable. We proceed to examine these cases in greater detail.


2. CHAINS WITH A SINGLE IRREDUCIBLE CLOSED SUBSET OF TWO OR MORE STATES

Let us assume that the stochastic matrix (3) with a single irreducible closed subset of $t \ge 2$ states can be written

$$P = \begin{pmatrix} S & 0 \\ Q & R \end{pmatrix} = \begin{pmatrix}
p_{11} & \cdots & p_{1t} & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
p_{t1} & \cdots & p_{tt} & 0 & \cdots & 0 \\
p_{t+1,1} & \cdots & p_{t+1,t} & p_{t+1,t+1} & \cdots & p_{t+1,s} \\
\vdots & & \vdots & \vdots & & \vdots \\
p_{s1} & \cdots & p_{st} & p_{s,t+1} & \cdots & p_{ss}
\end{pmatrix}, \qquad (4)$$

where the transition probabilities are $p_{ij} = p_{ij}(\theta)$, some of which may be zero, and where states $E_1, \dots, E_t$ form a positively regular chain. It is always possible in a sufficiently large number of transitions to pass from any state $E_{t+1}, \dots, E_s$ into the subset of states $E_1, \dots, E_t$; once the system has entered this closed subset no exit from it is possible.

Consider now a realization of the same kind as that for positively regular chains, that is, one starting with a fixed initial state and consisting of a fixed number of transitions. If it started with one of $E_1, \dots, E_t$ we should be dealing with the case of the positively regular chain already investigated; we therefore begin with one of $E_{t+1}, \dots, E_s$, and obtain a realization $S$ of the $n + 1$ states $E_a, E_b, \dots, E_u$, for which the likelihood function is

$$L(S) = \sum_{i,j=1}^{s} n_{ij}\ln p_{ij},$$

where the $n_{ij}$ are frequencies of transition from $E_i$ to $E_j$ such that $\sum_{i,j=1}^{s} n_{ij} = n$.

It can be verified directly that the form (1) for the $p_{ij}(\theta)$ will satisfy, exactly as in the positively regular case, the sufficiency conditions for the maximum-likelihood estimator $T$ of $\theta$. For the likelihood function is then

$$L(S) = \sum_{i,j=1}^{s} n_{ij}K_{ij}A_1(\theta) + nA_2(\theta) + \sum_{i,j=1}^{s} n_{ij}\ln a_{ij},$$

and gives the estimator $T$ of $\theta$ as

$$\sum_{i,j=1}^{s} n_{ij}K_{ij}/n = -A_2'(T)/A_1'(T) = g(T),$$

so that $L(S)$ is clearly factorizable as

$$L(S) = n\{g(T)A_1(\theta) + A_2(\theta)\} + \sum_{i,j=1}^{s} n_{ij}\ln a_{ij}. \qquad (5)$$

It follows that $T$ is sufficient; moreover, as in Rao (1952, §4a.3) it can be shown that
$$\sum_{i,j} n_{ij}K_{ij}/n = -A_2'(T)/A_1'(T)$$
is an unbiased estimator of $g(\theta) = -A_2'(\theta)/A_1'(\theta)$, so that
$$E\Bigl(\sum_{i,j} n_{ij}K_{ij}/n\Bigr) = g(\theta).$$

A point of minor interest which follows from the structure of the matrix (4) is that the number of distinct forms for the $p_{ij}(\theta)$ in any one of the rows $1, \dots, t$ is $r \le t$; it follows that there can only be $r \le t$ distinct forms for the $p_{ij}$ in the remaining rows $t+1, \dots, s$ of the matrix.
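To make the estimator concrete, the sketch below computes $\sum n_{ij}K_{ij}/n$ from a realization. The three-state chain used here is hypothetical and chosen only for illustration: state 2 is transient, states 0 and 1 form the closed subset, and every transition probability is either $\theta$ (written as $\exp\{A_1 + A_2\}$ with $K_{ij} = 1$) or $1 - \theta$ (written as $\exp\{A_2\}$ with $K_{ij} = 0$), where $A_1(\theta) = \log\{\theta/(1-\theta)\}$ and $A_2(\theta) = \log(1 - \theta)$, so that $g(\theta) = \theta$.

```python
from collections import Counter


def sufficient_statistic(path, K):
    """Return sum_{i,j} n_ij K_ij / n for a realization given as a list of states.

    `K[i][j]` holds the exponent K_ij for each possible transition i -> j;
    both the function name and this layout are assumptions of the sketch.
    """
    counts = Counter(zip(path, path[1:]))   # transition frequencies n_ij
    n = len(path) - 1                       # fixed number of transitions
    return sum(c * K[i][j] for (i, j), c in counts.items()) / n


# Hypothetical chain: from 2, stay (1 - theta, K = 0) or move to 1 (theta, K = 1);
# from 0 or 1, move to 0 (theta, K = 1) or to 1 (1 - theta, K = 0).
K = {0: {0: 1, 1: 0}, 1: {0: 1, 1: 0}, 2: {1: 1, 2: 0}}
path = [2, 2, 1, 0, 1, 1, 0]                # starts in the transient state
estimate = sufficient_statistic(path, K)
```

In this example three of the six transitions are of the $\theta$ type, so the statistic, and hence the estimate of $g(\theta) = \theta$, is 0.5.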


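3. CHAINS WITH A SINGLE ABSORBING STATE

A typical stochastic matrix with an absorbing state $E_1$ can be written as

$$P = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
p_{21} & p_{22} & \cdots & p_{2s} \\
\vdots & \vdots & & \vdots \\
p_{s1} & p_{s2} & \cdots & p_{ss}
\end{pmatrix}, \qquad (6)$$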

if 
D;;(0), some of which may be zero, are such that? | 
-+ E, is allowed to continue, E, must eventually. 
in the process being z, a random variable W 
his type of realization as sequential. 
quence 5" of states 


DE E PN Oe E 


Let such a realization result in b se 


J. GANI 219 


the likelihood function is
$$L(S') = \sum_{i,j} n_{ij} \ln p_{ij} \qquad (i = 2, \ldots, s;\ j = 1, \ldots, s),$$
where the $n_{ij}$ are transition frequencies from $E_i$ to $E_j$, such that $\sum_{i,j} n_{ij} = x$. If the transition probabilities $p_{ij}(\theta)$, except for $p_{11} = 1$, are of the form (1), the likelihood function
$$L(S') = \sum_{i,j} n_{ij} K_{ij} A_1(\theta) + x A_2(\theta) + \sum_{i,j} n_{ij} \ln a_{ij} \qquad (7)$$
will give the maximum-likelihood estimator $T_x$ of $\theta$ as
$$\sum_{i,j} n_{ij} K_{ij}/x = -A_2'(T_x)/A_1'(T_x) = g(T_x). \qquad (8)$$
Since $x$ is itself a random variable, the likelihood function $L(S')$ of (7) is not factorizable like $L(S)$ of equation (5), nor is $T_x$ a sufficient estimator of $\theta$.
A lemma due to E. Fay, quoted by Lehmann (1950), and proved under somewhat different conditions by Blackwell (1947), enables us to deduce that $(T_x, x)$ is a sufficient statistic for $\theta$. The lemma states that:

If for each value $m$ of $x$, $T_m$ is a sufficient statistic for $\theta$ in the sample of fixed size $m = \sum_{i,j} n_{ij}$, then $(T_x, x)$ will be a sufficient statistic of $\theta$ in the sequential case.

For in the sequence $S'$, starting with $E_i$, the conditional probability of the sequence, given that $T_x = t$ and $x = m$, is
$$\Pr(S' \mid T_x = t, x = m) = \frac{\Pr(S', T_x = t, x = m)}{\sum_{S'} \Pr(S', T_x = t, x = m)}
= \frac{\exp\{m\,g(t) A_1(\theta) + m A_2(\theta) + \sum_{i,j} n_{ij} \ln a_{ij}\}}{\sum_{S'} \exp\{m\,g(t) A_1(\theta) + m A_2(\theta) + \sum_{i,j} n_{ij} \ln a_{ij}\}}
= \frac{\exp(\sum_{i,j} n_{ij} \ln a_{ij})}{\sum_{S'} \exp(\sum_{i,j} n_{ij} \ln a_{ij})}, \qquad (9)$$
where $\sum_{S'}$ indicates summation over all possible realizations $S'$ beginning with $E_i$, consisting of $m$ transitions, and for which $T_m = t$. Since the probability (9) is independent of $\theta$, $(T_x, x)$ is a sufficient statistic for $\theta$ in the sequential case.

In general, however, the maximum-likelihood estimator (8) of $g(\theta)$ will no longer be unbiased. We shall not consider general methods of finding unbiased sufficient estimators of $g(\theta)$, but restrict our discussion to the simple case where $\sum n_{ij} K_{ij}$ and $x$ are linearly related. We show that in some of these cases, unbiased sufficient estimators reduce to those found by Girshick et al. (1946) for certain sequential binomial problems.


4. REGULAR CHAINS FOR WHICH $\sum n_{ij} K_{ij}$ AND $x$ ARE LINEARLY RELATED

Particularly simple results follow when there is a linear relation between the statistic $\sum n_{ij} K_{ij}$ and the number of transitions $x$ in a sequential realization. Let the relation be
$$\sum_{i,j} n_{ij} K_{ij} = A x + B, \qquad (10)$$

† Though apparently not in universal use, the term is self-explanatory; we follow Lehmann in employing it.




where $A$ and $B$ are constants independent of $x$; we see that the factorizability condition for the likelihood function (7) will now hold, since
$$L(S') = x\{A\,A_1(\theta) + A_2(\theta)\} + B\,A_1(\theta) + \sum_{i,j} n_{ij} \ln a_{ij} \qquad (11)$$
and the maximum-likelihood estimator $T_x$ of $\theta$ is given by
$$-B\,A_1'(T_x)/\{A\,A_1'(T_x) + A_2'(T_x)\} = x.$$
It is easily shown that
$$E(x) = -B\,A_1'(\theta)/\{A\,A_1'(\theta) + A_2'(\theta)\} = \Lambda(\theta),$$
so that $x$ is an unbiased estimate of $\Lambda(\theta)$. We consider some simple examples as illustrations.
The simplest stochastic matrix with an absorbing state is that for the two-state case,
$$P = \begin{pmatrix} 1 & 0 \\ 1-\theta & \theta \end{pmatrix}, \qquad p_{2j} = \exp\{K_{2j} \ln[\theta/(1-\theta)] + \ln(1-\theta)\} \quad (j = 1, 2), \qquad (12)$$
where $K_{21} = 0$, $K_{22} = 1$. For a realization in $x$ transitions, the system starting from $E_2$ must move from $E_2$ to $E_1$ only in the last transition. The relation (10), which is
$$\sum_{i,j} n_{ij} K_{ij} = (x-1) K_{22} + K_{21} = x - 1,$$
is clearly satisfied, and the maximum-likelihood estimator $T_x = (x-1)/x$ of $\theta$ is sufficient.
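The two-state case can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names `log_likelihood`, `mle` and `grid_argmax` are hypothetical. It assumes a realization that stays in $E_2$ for $x-1$ transitions and is absorbed on the last one, and compares the closed form $(x-1)/x$ with a brute-force maximization of the log-likelihood.

```python
import math

def log_likelihood(theta, x):
    """Log-likelihood of a two-state realization absorbed after x transitions.

    The chain stays in E_2 with probability theta for x - 1 transitions and
    then moves to the absorbing state E_1 with probability 1 - theta.
    """
    return (x - 1) * math.log(theta) + math.log(1.0 - theta)

def mle(x):
    """Closed-form maximizer T_x = (x - 1) / x quoted in the text."""
    return (x - 1) / x

def grid_argmax(x, steps=9999):
    """Brute-force maximizer on an evenly spaced grid (sanity check only)."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return max(grid, key=lambda th: log_likelihood(th, x))
```

For instance, `mle(5)` and `grid_argmax(5)` should agree to within the grid spacing.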
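The three-state case, with the stochastic matrix of the form
$$P = \begin{pmatrix} 1 & 0 & 0 \\ a_{21} \exp(K_{21} A_1 + A_2) & a_{22} \exp(K_{22} A_1 + A_2) & a_{23} \exp(K_{23} A_1 + A_2) \\ a_{31} \exp(K_{31} A_1 + A_2) & a_{32} \exp(K_{32} A_1 + A_2) & a_{33} \exp(K_{33} A_1 + A_2) \end{pmatrix},$$
is known to have the same two or three distinct values of the exponents $K_{ij}$ in each row, with some of the $a_{ij}$ possibly zero. It is found, on considering realizations of the chain starting from a fixed state $E_i$ $(i = 2, 3)$, that the linear relation (10) will hold only for stochastic matrices
$$P = \begin{pmatrix} 1 & 0 & 0 \\ 1-\theta & \alpha\theta & (1-\alpha)\theta \\ 1-\theta & \beta\theta & (1-\beta)\theta \end{pmatrix} \qquad (0 < \alpha, \beta < 1), \qquad (13)$$
when it takes the form $\sum_{i,j} n_{ij} K_{ij} = x - 1$, or
$$P = \begin{pmatrix} 1 & 0 & 0 \\ 1-\theta & \theta & 0 \\ 0 & 1-\theta & \theta \end{pmatrix}, \qquad (14)$$
when it is $\sum_{i,j} n_{ij} K_{ij} = x - i + 1$, depending on the initial state $E_i$ $(i = 2, 3)$.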
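The solutions (13) and (14) are exhaustive for the three-state case, but it is difficult to treat the general case of the chain with $s$ states in a systematic manner. Nevertheless, it can easily be verified that linear relations (10) hold for stochastic matrices
$$P = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 1-\theta & a_{22}\theta & \cdots & a_{2s}\theta \\ \vdots & \vdots & & \vdots \\ 1-\theta & a_{s2}\theta & \cdots & a_{ss}\theta \end{pmatrix}, \qquad (15)$$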




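which are a generalization of (13), or
$$P = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1-\theta & \theta & 0 & \cdots & 0 \\ 0 & 1-\theta & \theta & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1-\theta & \theta \end{pmatrix}, \qquad (16)$$
a generalization of (14), or also stochastic matrices of the mixed form
$$P = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
1-\theta & a_{22}\theta & \cdots & a_{2k}\theta & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1-\theta & a_{k2}\theta & \cdots & a_{kk}\theta & 0 & \cdots & 0 \\
0 & \cdots & 0 & 1-\theta & \theta & \cdots & 0 \\
\vdots & & & & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & 0 & \cdots & 1-\theta & \theta
\end{pmatrix}
\qquad \Bigl(0 < a_{ij} < 1;\ \sum_{j=2}^{k} a_{ij} = 1\Bigr). \qquad (17)$$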
If we assume that the initial state in a realization is the last state $E_s$, the stochastic matrices (16), of which (12) and (14) are particular cases when $s$ is 2 and 3 respectively, correspond to part of a curtailed single sampling scheme of Girshick et al. (1946, §3B). In the matrix (16), $1-\theta$ is the probability of finding a defective in the sample, and sampling continues until $s-1$ defectives are observed. The likelihood function of a realization $S'$ is clearly
$$L(S') = (s-1) \ln(1-\theta) + (x-s+1) \ln\theta,$$
and the maximum-likelihood estimator of $\theta$
$$T_x = 1 - (s-1)/x.$$
This, however, is biased; for, from the probability of a realization
$$\Pr(S') = \binom{x-1}{s-2} (1-\theta)^{s-1} \theta^{x-s+1} \qquad (x = s-1, s, \ldots),$$
we can obtain the moment-generating function of $x$,
$$M(t) = E(e^{tx}) = \Bigl\{\frac{(1-\theta) e^{t}}{1 - \theta e^{t}}\Bigr\}^{s-1},$$
and from it, by first integrating and then substituting $t = 0$, we find that $E(x^{-1})$ is not equal to $(1-\theta)/(s-1)$, so that the expectation of $T_x$ is not $\theta$.
Girshick et al. (1946) give the unbiased sufficient estimator of $\theta$ for values of $s > 2$ as
$$\hat\theta = 1 - (s-2)/(x-1).$$
Taking expectations with respect to the distribution of $x$, we can verify directly that the expectation of the estimator $\hat\theta$ is $\theta$. The case of $s = 2$ is somewhat special; Girshick et al. have shown that here, $\hat\theta$ can only be 0 or 1.
The previous method for finding unbiased sufficient estimators of $\theta$ does not apply to matrices of the forms (15) or (17); for these, the necessary condition that the probability of different paths in a realization of $x$ transitions be the same does not hold.
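The bias of the maximum-likelihood estimator for the curtailed sampling scheme (16), and the unbiasedness of the estimator $1 - (s-2)/(x-1)$, can be checked numerically. This is only a sketch under stated assumptions: the names `pmf`, `expectation`, `mle` and `unbiased` are hypothetical, the infinite sum over $x$ is truncated at `x_max`, and $s > 2$ is assumed.

```python
from math import comb

def pmf(x, s, theta):
    """Pr(x transitions) when sampling stops at the (s-1)th defective.

    Each transition finds a defective with probability 1 - theta.
    """
    return comb(x - 1, s - 2) * (1 - theta) ** (s - 1) * theta ** (x - s + 1)

def expectation(estimator, s, theta, x_max=5000):
    """Truncated expectation of estimator(x) over x = s-1, ..., x_max."""
    return sum(estimator(x) * pmf(x, s, theta) for x in range(s - 1, x_max + 1))

def mle(s):
    """Maximum-likelihood estimator T_x = 1 - (s-1)/x as a function of x."""
    return lambda x: 1 - (s - 1) / x

def unbiased(s):
    """Estimator 1 - (s-2)/(x-1); assumes s > 2 so that x - 1 >= 1."""
    return lambda x: 1 - (s - 2) / (x - 1)
```

With, say, `s = 4` and `theta = 0.6`, `expectation(unbiased(4), 4, 0.6)` should reproduce `theta` up to truncation and rounding error, while `expectation(mle(4), 4, 0.6)` falls below it.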


5. NON-REGULAR FINITE, AND INFINITE CHAINS SATISFYING THE LINEAR RELATION

The condition (10) that the estimator of $\theta$ in regular chains with an absorbing state be sufficient, may apply equally to some non-regular or infinite chains with absorbing states for which realizations start from some fixed state $E_i$ and continue until an absorbing state is reached in $x$ transitions. Suppose that, associated with all non-absorbing states, there are only two distinct values of the transition probabilities $p_{ij}$: $\theta$ and $1-\theta$, the other values being zero. Then, grouping together the $n_{ij}$ for which the probabilities are $\theta$, and those for which they are $1-\theta$, so that
$$\sum_{\theta} n_{ij} = n_\theta, \qquad \sum_{1-\theta} n_{ij} = n_{1-\theta}, \qquad n_\theta + n_{1-\theta} = x,$$
the likelihood function of the realization is simply
$$L = n_\theta \ln\theta + n_{1-\theta} \ln(1-\theta), \qquad (18)$$
and the maximum-likelihood estimator of $\theta$
$$T_x = n_\theta/(n_\theta + n_{1-\theta}) = n_\theta/x.$$
Providing the linear relation (10), which reduces to
$$n_\theta = A x + B, \qquad (19)$$
holds, the likelihood function (18) is factorizable as in (11), and the estimator $T_x$ is sufficient.
A non-regular finite chain with $s$ states which satisfies these conditions is the random walk with two absorbing end states, with the stochastic matrix
$$P = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 1-\theta & 0 & \theta & \cdots & 0 & 0 \\ \vdots & \ddots & \ddots & \ddots & & \vdots \\ 0 & 0 & \cdots & 1-\theta & 0 & \theta \\ 0 & 0 & \cdots & 0 & 0 & 1 \end{pmatrix} \qquad (0 < \theta < 1). \qquad (20)$$
For a realization starting with $E_i$ $(i = 2, \ldots, s-1)$, and ending with $E_1$ after $x$ transitions, the linear relation (19) is
$$n_\theta = \tfrac12 (x - i + 1),$$
and the estimator of $\theta$,
$$T_x = \tfrac12 + \tfrac12 (1-i)/x, \qquad (21)$$
will be sufficient. A similar result will hold if the process ends with $E_s$.
The sufficiency conditions also hold for certain infinite random walks with one absorbing state; for the chain with the infinite stochastic matrix
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 1-\theta & 0 & \theta & 0 & \cdots \\ 0 & 1-\theta & 0 & \theta & \cdots \\ \vdots & & \ddots & \ddots & \ddots \end{pmatrix} \qquad (0 < \theta < 1), \qquad (22)$$
for which a realization starts with $E_i$ $(i = 2, \ldots)$ and ends with $E_1$ after $x$ steps, the relation (19) between $n_\theta$ and $x$ holds again, and ensures the sufficiency of the estimator (21) of $\theta$.
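The linear relation between the number of upward steps and the number of transitions can be illustrated by simulation. The sketch below is not from the paper: `absorbed_walk` and `estimator` are hypothetical names, and the walk is assumed to move from $E_i$ to $E_{i+1}$ with probability $\theta$ and to $E_{i-1}$ with probability $1-\theta$, with absorbing states $E_1$ and $E_s$.

```python
import random

def absorbed_walk(i, s, theta, rng):
    """Simulate states E_1..E_s with absorbing ends, starting from E_i.

    Each step moves up one state with probability theta, down with 1 - theta.
    Returns (final_state, x, n_up), where x is the number of transitions and
    n_up the number of transitions made with probability theta.
    """
    state, x, n_up = i, 0, 0
    while 1 < state < s:
        if rng.random() < theta:
            state += 1
            n_up += 1
        else:
            state -= 1
        x += 1
    return state, x, n_up

def estimator(i, x):
    """T_x = 1/2 + (1 - i) / (2 x), the form (21) for walks ending at E_1."""
    return 0.5 + 0.5 * (1 - i) / x
```

For every simulated path that ends at $E_1$, `n_up` equals `(x - i + 1) / 2`, so `estimator(i, x)` coincides with the observed proportion `n_up / x`.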

Another infinite random walk is represented by
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & \cdots \\ 1-\theta & 0 & 0 & \theta & 0 & \cdots \\ 0 & 1-\theta & 0 & 0 & \theta & \cdots \\ \vdots & & \ddots & & & \ddots \end{pmatrix} \qquad (0 < \theta < \tfrac13),$$
which for a similar realization will lead to the linear relation
$$n_\theta = \tfrac13 (x - i + 1),$$
which again ensures the sufficiency of the estimator
$$T_x = \tfrac13 + (1-i)/(3x).$$
Several illustrative examples of the same kind can be constructed; for each, the method of Girshick et al. would lead to an unbiased estimator of $\theta$ which would be a function of the sufficient estimator $T_x$. We restrict ourselves to a single example mentioned by Moran, which reduces to a case already described.
A Oran (1953) considers the infinite rando 
States are H_,,..., Ho, «+» Egon where E_, 


Or th; Bs 
x ipa chain is (22); the process starts from Fo; 
TM Steps, or any of the other available states F; (t= —(8—1), 

hese two cases are respectively given as 


Eps. The likelihood functions in € 
L- 3(x—s)In04- Ac s)In (1-0) GT 3-52 1)n043(n-5—)In — 6). 


‘i Maximum-likelihood estimator T, of @ is found to be 
wh T,={T9 = (w—s)|(2%) or TO = (n —s4-3)[ (2). 
th Te, since T9» 4- s|(2n), T® = }-s|(2x) < 4—s|(2n) « TA, 

i estimator T,={TP or TO} will itself specify the appropriate likelihood function, without 
ae to state whether absorption has occurred or not. We verify that the likelihood 
ions in the two cases can be written as 
e = 4s(1 - 279-1 m 6(1 — 6) - 351n (1 — 6), K 
tj; the first is effectively the form (18), and the secon 
y Mm TI. of 0 is sufficient, but TQ is not unbiased. 

© method of Girshick et al. (1946) Jeads to the 


of 9. 


m walk with an absorbing barrier for which 
is the absorbing state. The stochastic matrix 
and proceeds until either E_, is reached in 
0, ...,2) is reached in 


L,- Tn 6(1—6)7 4 n1n (1— 0), 
the ordinary binomial case. The 


sufficient unbiased estimator 


sino or TOY of 0, where these are ( n—1 )- ( n—1 ) 
—s+i)) Wn-s-? 
qo-1-S8z 1s) qo a a 4 | 
3s(x— 1) | 7 
Fr G jn-s4i)  dn-5-9 
o 
™ the expressions (s) 4 "m 
= are) Oe 
TO) — , 6D -279) d, T m l 
ett 372) zT (aro) (na -22)-8 


of the maximum-likelihood estimator Ty. 


it i 
8 . 4 
“lear that the unbiased estimator 18 a function 




I wish to thank the unknown referee of an earlier paper (Gani, 1955) for suggestions which have been incorporated in §5, and also Profs. P. A. Moran and M. S. Bartlett, and Dr H. Ruben for criticisms of an earlier draft of the paper.


REFERENCES
BARTLETT, M. S. (1955). An Introduction to Stochastic Processes. Cambridge University Press.
BLACKWELL, D. (1947). Conditional expectation and unbiased sequential estimation. Ann. Math. Statist. 18, 105-10.
FELLER, W. (1950). An Introduction to Probability Theory and its Applications. New York: John Wiley.
GANI, J. (1955). Some theorems and sufficiency conditions for the maximum likelihood estimator of an unknown parameter in a simple Markov chain. Biometrika, 42, 342-59.
GIRSHICK, M. A., MOSTELLER, F. & SAVAGE, L. J. (1946). Unbiased estimates for certain binomial sampling problems with applications. Ann. Math. Statist. 17, 13-23.
LEHMANN, E. L. (1950). Notes on the Theory of Estimation. Lectures delivered at the University of California.
MORAN, P. A. P. (1953). The estimation of the parameters of a birth and death process. J.R. Statist. Soc. B, 15, 241-5.
RAO, C. R. (1952). Advanced Statistical Methods in Biometric Research. New York: John Wiley.




SOME ASYMPTOTIC DISTRIBUTION THEORY FOR MARKOV
CHAINS WITH A DENUMERABLE NUMBER OF STATES†


By CYRUS DERMAN
Columbia University, New York


The asymptotic distribution is derived for certain functions of the sample realizations of a Markov chain with denumerably many states, from which the joint asymptotic distribution theory of estimates of the transition probabilities is obtained. Application is made to a goodness of fit test.


1. INTRODUCTION

Let $X_0, X_1, X_2, \ldots$ be a sequence of random variables which assume only non-negative integer values‡ and which have the property that
$$\Pr\{X_{m+n} = j \mid X_m = i, X_{m-1}, \ldots, X_0\} = \Pr\{X_{m+n} = j \mid X_m = i\} = \Pr\{X_n = j \mid X_0 = i\}$$
for all integers $m, n > 0$ and all states $i$ and $j$ for which the conditional probability is defined. The distribution of $X_0$ will be fixed but arbitrary. Sequences of this type are known as Markov chains with a denumerable number of states and stationary transition probabilities. The fundamentals of the theory were laid down by Kolmogorov (1936). Feller (1950) has given an exposition of the main results of the theory.

The probability given above is called the $n$th step transition probability. We shall denote it by $p_{ij}^{(n)}$. When $n = 1$ we shall simply write $p_{ij}^{(1)} = p_{ij}$.

Suppose such a model is assumed and it is of interest to estimate or test hypotheses about the probabilities $p_{ij}$ on the basis of a set of observations $x_1, \ldots, x_n$. We shall develop here some asymptotic distribution theory suitable for this purpose. Results of a similar nature were obtained by Bartlett (1951). However, he assumed a Markov chain with only a finite number of states. Anderson & Goodman (1956) also considered inference problems for such chains. Our main tool will be a central limit theorem of Doeblin§ (1938) for denumerable Markov chains.

In the remainder of the introduction we shall state, without proof, those results from the theory which we shall need. The $n$th step transition probabilities satisfy the relation
$$p_{ij}^{(n)} = \sum_{k=0}^{\infty} p_{ik}\, p_{kj}^{(n-1)} \qquad (i, j = 0, 1, \ldots;\ n = 2, 3, \ldots),$$
or more generally
$$p_{ij}^{(m+n)} = \sum_{k=0}^{\infty} p_{ik}^{(m)}\, p_{kj}^{(n)} \qquad (i, j = 0, 1, \ldots;\ m, n = 1, 2, \ldots).$$
The latter equations are known as the Chapman-Kolmogorov equations.
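The Chapman-Kolmogorov equations can be verified numerically for a small example. This is a minimal sketch, not part of the original paper: it assumes a chain with finitely many states (the paper treats denumerably many), stored as a list of rows, and the helper names `mat_mul` and `n_step` are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(p, n):
    """n-step transition matrix p^(n), for n >= 1, by repeated multiplication."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result
```

For any transition matrix `p`, `n_step(p, m + n)` should agree with `mat_mul(n_step(p, m), n_step(p, n))` up to rounding error.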


† Work supported in part by the United States Air Force through the Office of Scientific Research of the Air Research and Development Command.
‡ Each integer denotes a possible state of the Markov chain.
§ I am indebted to Dr T. E. Harris for calling my attention to the applicability of Doeblin's theorem in connexion with this paper.


286  . Some asymplotic distribution theory for Markov chains 


An important notion in Markov chain theory is that of the number of transitions necessary to reach a certain state for the first time given initially a particular state. We let
$$f_{ij}^{(n)} = \Pr\{X_{m+n} = j,\ X_{m+\nu} \neq j \text{ for } 1 \le \nu < n \mid X_m = i\} \qquad (m, i, j = 0, 1, \ldots;\ n = 1, 2, \ldots),$$
$$f_{ij} = \sum_{n=1}^{\infty} f_{ij}^{(n)}, \qquad m_{ij} = \sum_{n=1}^{\infty} n f_{ij}^{(n)}.$$
We call $m_{ij}$ the mean first passage time from state $i$ to state $j$. If $i = j$, $m_{ii}$ is called the mean recurrence time of state $i$. States $i$ and $j$ are said to belong to the same class if $f_{ij} f_{ji} > 0$, i.e. if there exist integers $n(i, j)$ and $n(j, i)$ such that $p_{ij}^{(n(i,j))} > 0$ and $p_{ji}^{(n(j,i))} > 0$. A state $i$ is called recurrent if $f_{ii} = 1$. This implies that $f_{ij} = 1$ for all $j$ belonging to the same class. If $f_{ii} < 1$, $i$ is called transient. If $i$ is recurrent and $j$ belongs to the same class as $i$, then $j$ is also recurrent. Consequently, all states of a class are either all recurrent or all transient. We can then speak of a class as being recurrent or transient. If all states belong to the same class we refer to the Markov chain as being irreducible recurrent (transient). Although it might not always be explicitly stated, we shall deal throughout only with irreducible chains. If $f_{ii} = 1$ and $m_{ii} < \infty$, then $i$ is said to be a positive state; if $m_{ii} = \infty$, $i$ is called null. If $i$ is positive (null) and $j$ belongs to the same class, then $j$ is positive (null). Thus the states of a recurrent class will all be positive or all null.

If $f_{ii} = 1$, $m_{ii} < \infty$, then $\lim_{n \to \infty} \frac{1}{n} \sum_{\nu=1}^{n} p_{ji}^{(\nu)} = \frac{1}{m_{ii}}$ for all states $j$ belonging to the same class as $i$. If $m_{ii} = \infty$, then $\lim_{n \to \infty} p_{ji}^{(n)} = 0$. If all states belong to the same positive class, then
$$\sum_{i=0}^{\infty} \frac{p_{ij}}{m_{ii}} = \frac{1}{m_{jj}} \qquad (j = 0, 1, \ldots).$$
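The relation between the averaged transition probabilities and the mean recurrence time can be illustrated on a two-state chain. The sketch below is an illustration only, under the assumption of the finite matrix $[[1-a, a], [b, 1-b]]$ with $0 < a, b < 1$; the function names are hypothetical and the closed form for $m_{ii}$ is the reciprocal of the stationary probability of state $i$.

```python
def cesaro_average(p, j, i, n):
    """(1/n) * sum_{v=1..n} p_ji^(v) for a finite transition matrix p."""
    size = len(p)
    row = [1.0 if k == j else 0.0 for k in range(size)]  # distribution at time 0
    total = 0.0
    for _ in range(n):
        # One more transition: row becomes the distribution after v steps.
        row = [sum(row[k] * p[k][c] for k in range(size)) for c in range(size)]
        total += row[i]
    return total / n

def mean_recurrence_time_two_state(a, b, i):
    """m_ii for the chain [[1-a, a], [b, 1-b]] with 0 < a, b < 1."""
    stationary = (b / (a + b), a / (a + b))
    return 1.0 / stationary[i]
```

For large `n`, `cesaro_average(p, j, i, n)` should be close to `1 / mean_recurrence_time_two_state(a, b, i)` whichever starting state `j` is chosen.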


We call a state $i$ periodic with period $t$ if $t$ is the greatest common divisor of all $n$ such that $p_{ii}^{(n)} > 0$. All states in a class have the same period. A chain, if irreducible, is called periodic or aperiodic according as $t > 1$ or $t = 1$.
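The definition of the period can be turned into a small computation for a finite chain. This is a hedged sketch rather than anything taken from the paper: the function name `period` and the cut-off `horizon` are assumptions, and only return times up to `horizon` are examined.

```python
from math import gcd

def period(p, i, horizon=50):
    """gcd of all n <= horizon with p_ii^(n) > 0 (0 if no return is seen).

    Only the pattern of positive entries of the finite matrix p is used.
    """
    size = len(p)
    reach = [k == i for k in range(size)]  # states reachable from i in 0 steps
    t = 0
    for n in range(1, horizon + 1):
        # States reachable from i in exactly n steps.
        reach = [any(reach[k] and p[k][c] > 0 for k in range(size))
                 for c in range(size)]
        if reach[i]:
            t = gcd(t, n)
    return t
```

A chain that alternates deterministically between two states gives period 2, while any state with $p_{ii} > 0$ gives period 1.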


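We shall consider the number of times a state is visited in $n$ transitions. Let
$$N_n(i) = \{\text{the number of } \nu\text{'s such that } X_\nu = i \text{ for } 1 \le \nu \le n\}$$
for $i = 0, 1, \ldots$; $n = 1, 2, \ldots$. More generally we define
$$N_n(i_1, \ldots, i_r) = \{\text{the number of } \nu\text{'s such that } X_\nu = i_1, \ldots, X_{\nu+r-1} = i_r \text{ for } 1 \le \nu \le n-r+1\}$$
where $i_1, \ldots, i_r$ is a finite subset of all states.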


2. SOME LIMIT THEOREMS

We shall assume throughout that the Markov chain under consideration is irreducible, positive recurrent and aperiodic.

We first state, without proof, a theorem proved by Doeblin (1938). Consider the sequence of random variables $\{Y_n\}$ attached to the Markov chain $\{X_n\}$ in the following way: $Y_n = x_i$ if $X_n = i$ $(i = 0, 1, \ldots)$, where the $x$'s are arbitrary real numbers. Let
$$U_l(i) = Y_{m_l+1} + \cdots + Y_{m_{l+1}},$$
where $m_l$ is the trial in which state $i$ is reached for the $l$th time. It follows that $\{U_l(i)\}$ $(l = 1, 2, \ldots)$ is a sequence of independent and identically distributed random variables.

Cyrus DERMAN 287 
Let $g(j) = x_j - EU(i)/m_{ii}$ (the dependence on $i$ is only apparent; cf. (2·1)).

THEOREM 1 (Doeblin). If $EU_l^2(i) < \infty$ and $\sigma_{ii}^2 < \infty$, then the distribution of
$$n^{-1/2} \sum_{m=1}^{n} \Bigl(Y_m - \frac{EU(i)}{m_{ii}}\Bigr)$$
tends as $n \to \infty$ to a Normal distribution with mean zero and variance
$$\sigma^2 = \sum_{j=0}^{\infty} \frac{g^2(j)}{m_{jj}} + 2 \sum_{j=0,\, j \neq i}^{\infty} \frac{g(j)}{m_{jj}} \sum_{k=0}^{\infty} \frac{m_{ji} + m_{ik} - m_{jk}}{m_{kk}}\, g(k),$$
provided that the series converge absolutely.

The expression for $\sigma^2$ is due to Chung (1953). Using Doeblin's theorem, we now prove

THEOREM 2. Let $i_1, \ldots, i_r$ be any finite sequence of distinct states belonging to the same positive class and suppose there exists a state $i$ in the class such that $\sigma_{ii}^2 < \infty$, then there is a matrix $\Sigma$ such that the distribution of
$$n^{-1/2} \Bigl(N_n(i_1) - \frac{n}{m_{i_1 i_1}}, \ldots, N_n(i_r) - \frac{n}{m_{i_r i_r}}\Bigr)$$
tends as $n \to \infty$ to a multivariate Normal distribution with mean zero and covariance matrix $\Sigma$, which can be found from the expression for $\sigma^2$ in Doeblin's theorem.
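Proof. Chung (1953) has shown that
$$EU(i) = m_{ii} \sum_{j=0}^{\infty} \frac{x_j}{m_{jj}},$$
$$EU^2(i) = m_{ii} \sum_{j=0}^{\infty} \frac{x_j^2}{m_{jj}} + 2 m_{ii} \sum_{j=0,\, j \neq i}^{\infty} \frac{x_j}{m_{jj}} \sum_{k=0}^{\infty} \frac{m_{ji} + m_{ik} - m_{jk}}{m_{kk}}\, x_k, \qquad (2·1)$$
provided that the series converge absolutely.

Let $x_{i_\alpha} = \lambda_\alpha$ $(\alpha = 1, \ldots, r)$, $x_j = 0$ for all $j \neq i_1, \ldots, i_r$, where $\lambda_1, \ldots, \lambda_r$ is any arbitrary set of real numbers. It is clear from (2·1) that $EU^2(i) < \infty$ and
$$EU(i) = m_{ii} \sum_{\alpha=1}^{r} \frac{\lambda_\alpha}{m_{i_\alpha i_\alpha}}. \qquad (2·2)$$
Then
$$n^{-1/2} \sum_{m=1}^{n} \Bigl(Y_m - \frac{EU(i)}{m_{ii}}\Bigr) = n^{-1/2} \Bigl(\sum_{m=1}^{n} Y_m - n \sum_{\alpha=1}^{r} \frac{\lambda_\alpha}{m_{i_\alpha i_\alpha}}\Bigr) = n^{-1/2} \sum_{\alpha=1}^{r} \lambda_\alpha \Bigl(N_n(i_\alpha) - \frac{n}{m_{i_\alpha i_\alpha}}\Bigr). \qquad (2·3)$$
Because of Theorem 1 and the continuity theorem for characteristic functions,
$$\lim_{n \to \infty} E \exp\Bigl\{ i\, n^{-1/2} \sum_{\alpha=1}^{r} \lambda_\alpha \Bigl(N_n(i_\alpha) - \frac{n}{m_{i_\alpha i_\alpha}}\Bigr) \Bigr\} = e^{-\frac12 Q}, \qquad (2·4)$$
where $Q = \sigma^2$ is a positive semi-definite quadratic function of $\lambda_1, \ldots, \lambda_r$. But since the $\lambda$'s are arbitrary (2·4) indicates the convergence of the distribution of the vector
$$n^{-1/2} \Bigl(N_n(i_1) - \frac{n}{m_{i_1 i_1}}, \ldots, N_n(i_r) - \frac{n}{m_{i_r i_r}}\Bigr) \qquad (2·5)$$
to a multivariate Normal distribution. The covariance matrix $\Sigma$ is determined by $Q$.

The fact that only one state $i$ such that $\sigma_{ii}^2 < \infty$ is necessary for the truth of Theorem 2 suggests the possibility that the existence of one such state $i$ might imply $\sigma_{jj}^2 < \infty$ for all $j$ in the same class. As pointed out in the introduction, the first recurrence moment is finite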



for all or no states of the same class. The following theorem, proved by Chung (1954), is a generalization of this fact and confirms the truth of the above conjecture:

THEOREM 3 (Chung). The p-th moment, $p > 0$, of the recurrence time of $i$ is finite, if and only if for every $j$ and $k$ belonging to the same class, the p-th moments of the first passage times from $j$ to $k$ and from $k$ to $j$ are finite.

Let $\{X_n(r)\}$ denote a Markov chain derived from $\{X_n\}$ as follows: $X_n(r) = (i_1, \ldots, i_r)$ if $X_n = i_1, \ldots, X_{n+r-1} = i_r$ for every set of states $i_1, \ldots, i_r$. States having probability zero of occurring will be ignored.


LEMMA 1. If the p-th recurrence moments for the states of $\{X_n\}$ are finite, then the p-th recurrence moments for the states of $\{X_n(r)\}$ are finite.

Proof. Consider any state $(i_1, \ldots, i_r)$ which has positive probability of occurring. Suppose $X_0 = i_1$. Let $\tau$ be a random variable which denotes the number of steps until the $r$th recurrence of $i_1$. Since $\tau$ is the sum of $r$ independent and identically distributed random variables (the number of steps until one recurrence of $i_1$) each of which has by hypothesis a finite $p$th moment, it follows easily that $E\tau^p < \infty$. We shall say the event $\epsilon$ is associated with $\tau$ if $X_\nu(r) = (i_1, \ldots, i_r)$ for some $\nu$ during these $\tau$ steps; i.e. the event $\epsilon$ is associated with a random variable $\tau$ if during the course of the $r$ recurrences of $i_1$ the sequence of states $i_1, \ldots, i_r$ is visited in succession. Let $\tau_1, \ldots, \tau_N$ denote $N$ successive such random variables where $N$ is the smallest possible integer such that the event $\epsilon$ is associated with two of the random variables. The random variables are independently distributed, $\tau_N$ and one other random variable have for their distributions the conditional distribution of $\tau$ given that $\epsilon$ is associated with it, and the $N-2$ others have for their distributions the conditional distribution of $\tau$ given that $\epsilon$ is not associated with it. Now, if it can be shown that $E(\tau_1 + \cdots + \tau_N)^p < \infty$, it will follow, since at least two occurrences of the state $(i_1, \ldots, i_r)$ take place during the time $\tau_1 + \cdots + \tau_N$, that the $p$th recurrence moment of the state $(i_1, \ldots, i_r)$ is finite. To this end let $P$ denote the probability that the event $\epsilon$ is associated with $\tau$. Since $(i_1, \ldots, i_r)$ has positive probability of occurring, $P > 0$. If $P = 1$, then $N = 2$ with probability 1, and it is easily seen that $E(\tau_1 + \tau_2)^p < \infty$. Suppose $P < 1$. Let $E(\tau^p \mid \epsilon)$ and $E(\tau^p \mid \bar\epsilon)$ denote the conditional expectations of $\tau^p$ given that $\epsilon$ is associated and not associated with $\tau$ respectively. Since
$$E\tau^p = E(\tau^p \mid \epsilon) P + E(\tau^p \mid \bar\epsilon)(1 - P) < \infty$$
it follows that $E(\tau^p \mid \epsilon)$ and $E(\tau^p \mid \bar\epsilon)$ are both finite. It is clear that
$$\Pr(N = \nu) = (\nu - 1) P^2 (1 - P)^{\nu-2} \qquad (\nu = 2, \ldots),$$
i.e. $N$ has a Pascal distribution. Therefore
$$E(\tau_1 + \cdots + \tau_N)^p = \sum_{\nu=2}^{\infty} E(\tau_1 + \cdots + \tau_\nu)^p \Pr(N = \nu)$$
$$\le \sum_{\nu=2}^{\infty} E(\nu \max(\tau_1, \ldots, \tau_\nu))^p (\nu - 1) P^2 (1 - P)^{\nu-2}$$
$$= \sum_{\nu=2}^{\infty} \nu^p E(\max(\tau_1, \ldots, \tau_\nu))^p (\nu - 1) P^2 (1 - P)^{\nu-2}$$
$$\le \sum_{\nu=2}^{\infty} \nu^{p+2} \max(E(\tau^p \mid \epsilon), E(\tau^p \mid \bar\epsilon))\, P^2 (1 - P)^{\nu-2}$$
$$< \infty.$$
This proves the lemma.
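The Pascal distribution used in the proof can be checked numerically: its probabilities sum to one and its moments are finite for $0 < P \le 1$. The following sketch is illustrative only; `pascal_pmf` and `truncated_moment` are hypothetical names and the infinite series is truncated at `v_max`.

```python
def pascal_pmf(v, p):
    """Pr(N = v) = (v - 1) p^2 (1 - p)^(v - 2) for v = 2, 3, ..."""
    return (v - 1) * p ** 2 * (1 - p) ** (v - 2)

def truncated_moment(p, power, v_max=2000):
    """Sum of v^power * Pr(N = v) for v = 2..v_max (power 0 gives total mass)."""
    return sum(v ** power * pascal_pmf(v, p) for v in range(2, v_max + 1))
```

With `p = 0.3`, `truncated_moment(0.3, 0)` should be close to 1 and `truncated_moment(0.3, 1)` close to `2 / 0.3`, the mean number of trials needed for two successes.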




Now let $i_{11}, \ldots, i_{1r_1}$; $i_{21}, \ldots, i_{2r_2}$; $\ldots$; $i_{k1}, \ldots, i_{kr_k}$ be $k$ sets of states each having $r_1, \ldots, r_k$ members respectively. Let
$$P_l = \prod_{\alpha=1}^{r_l - 1} p_{i_{l\alpha} i_{l,\alpha+1}} \qquad (l = 1, \ldots, k).$$
We prove now a slight, but useful, generalization of Theorem 2.

THEOREM 4. If $i_{11}, \ldots, i_{kr_k}$ belong to the same class and if $\sigma_{ii}^2 < \infty$ for $i$ in the same class, then as $n \to \infty$ the distribution of
$$n^{-1/2} \Bigl(N_n(i_{11}, \ldots, i_{1r_1}) - \frac{nP_1}{m_{i_{11} i_{11}}}, \ldots, N_n(i_{k1}, \ldots, i_{kr_k}) - \frac{nP_k}{m_{i_{k1} i_{k1}}}\Bigr) \qquad (2·6)$$
tends to a multivariate Normal distribution with mean zero and covariance matrix $\Sigma$ to be determined later.

Proof. Put $r = \max(r_1, \ldots, r_k)$. Let $\lambda_1, \ldots, \lambda_k$ be arbitrary real numbers and associate $\lambda_l$ with the sequence of states $i_{l1}, i_{l2}, \ldots, i_{lr_l}$ $(l = 1, \ldots, k)$. Let $\{Y_n(r)\}$ be defined from $\{X_n(r)\}$ as follows:
$$Y_n(r) = \sum{}' \lambda_l \quad \text{if } X_n(r) = (i_{l1}, \ldots, i_{lr_l}, j_{r_l+1}, \ldots, j_r),$$
where $\sum'$ denotes the sum over all such possible $l$ for a given state of $X_n(r)$; e.g. if $i$ and $i, j$ are two sets of states and $\lambda_1$ and $\lambda_2$ are associated with $i$ and $i, j$ respectively, then whenever $X_n(2) = (i, j)$, $Y_n(2) = \lambda_1 + \lambda_2$. If for a given state there is no such $l$, then $Y_n(r) = 0$. Let $(i_1, \ldots, i_r)$ be any state of $\{X_n(r)\}$. Define $U_l(i_1, \ldots, i_r)$ as in Theorem 2. By Lemma 1 the variance of the recurrence time of $(i_1, \ldots, i_r)$ is finite. We have, also, that $EU^2(i_1, \ldots, i_r) < \infty$, since $|U(i_1, \ldots, i_r)| \le Z \sum_{l=1}^{k} |\lambda_l|$, where $Z$ is the recurrence time of $(i_1, \ldots, i_r)$. Using the fact that
$$\lim_{n \to \infty} \frac{1}{n} \sum_{\nu=1}^{n} p_{ji}^{(\nu)} = \frac{1}{m_{ii}}$$
for all $i$, it is easy to see that the mean recurrence time of any state $(i_1, \ldots, i_r)$, which we denote by $m(i_1, \ldots, i_r)$, is $m_{i_1 i_1}/(p_{i_1 i_2} \cdots p_{i_{r-1} i_r})$. Thus, since absolute convergence is easily proved, we have from (2·1)
$$\frac{EU(i_1, \ldots, i_r)}{m(i_1, \ldots, i_r)} = \sum_{l=1}^{k} \lambda_l \sum{}'' \frac{1}{m(i_{l1}, \ldots, i_{lr_l}, j_{r_l+1}, \ldots, j_r)} = \sum_{l=1}^{k} \frac{\lambda_l P_l}{m_{i_{l1} i_{l1}}}, \qquad (2·7)$$
where $\sum''$ denotes the sum over all possible sequences $j_{r_l+1}, \ldots, j_r$. The argument used in proving Theorem 2 now applies again. This proves the theorem.

The case of interest to us below will be where the $r$'s are either 1 or 2. However, we shall continue in the general notation as long as possible.

For statistical applications of Theorem 4, it is necessary to be able to determine $\Sigma$, the covariance matrix of (2·6). From Formula 4 below it follows that the vector
$$n^{-1/2} \bigl(N_n(i_{11}, \ldots, i_{1r_1}) - EN_n(i_{11}, \ldots, i_{1r_1}), \ldots, N_n(i_{k1}, \ldots, i_{kr_k}) - EN_n(i_{k1}, \ldots, i_{kr_k})\bigr) \qquad (2·8)$$
and (2·6) have the same limiting distribution. Feller (1949) has shown, assuming the finiteness of a second recurrence moment, that the limit of the variances of $n^{-1/2} N_n(i)$ as $n \to \infty$ is equal to the variances of the limiting distribution of (2·8). If we can show that the same holds for the covariances then it will follow that $\Sigma$ can be approximated by computing the covariance matrix of (2·8).

The following theorem is for this purpose.


THEOREM 5. Suppose the sequence of random vectors $(X_n, Y_n)$ converges in distribution to a joint distribution function $F(x, y)$ which has finite second moments. Let $(X, Y)$ be a random vector with joint distribution function $F(x, y)$. If $\lim_{n \to \infty} EX_n^2 = EX^2$ and $\lim_{n \to \infty} EY_n^2 = EY^2$, then
$$\lim_{n \to \infty} EX_n Y_n = EXY.$$

Proof. Let $Z_n = X_n + Y_n$, $Z = X + Y$. A consequence of the Helly-Bray theorem is that $EZ^2 \le \liminf_{n \to \infty} EZ_n^2$. This implies $E(XY) \le \liminf_{n \to \infty} E(X_n Y_n)$. Now let $Z'_n = X_n - Y_n$, $Z' = X - Y$. In the same way we get that $E(XY) \ge \limsup_{n \to \infty} E(X_n Y_n)$. Therefore $\lim_{n \to \infty} E(X_n Y_n) = E(XY)$, proving the theorem.


3. COMPUTATION OF THE COVARIANCE MATRIX

We now commence with the computation of the covariance matrix of (2·8). It will be convenient to prove some asymptotic formulae which will be needed in the computation. In all that follows the second recurrence moment will be assumed finite.

We state without proof

LEMMA 2. If $a_\nu > 0$, $\lim_{n \to \infty} b_n = b$, $\lim_{n \to \infty} a_n \big/ \sum_{\nu=0}^{n} a_\nu = 0$, then
$$\lim_{n \to \infty} \sum_{\nu=0}^{n} a_{n-\nu} b_\nu \Big/ \sum_{\nu=0}^{n} a_\nu = b.$$

Formula 1.
$$\lim_{n \to \infty} \sum_{\nu=1}^{n} \Bigl(p_{ij}^{(\nu)} - \frac{1}{m_{jj}}\Bigr) = \frac{M_{jj}}{2m_{jj}^2} - \frac{m_{ij} - 1}{m_{jj}} \qquad (i, j = 0, 1, 2, \ldots),$$
where
$$M_{jj} = \sum_{\nu=1}^{\infty} \nu(\nu - 1) f_{jj}^{(\nu)}.$$
Proof. We have directly 


Am 
N 1 N n 1 = j 
D us v n—v) . y=1 

X (vt DET Ca | at 


n=1 4) nel My Ty 
"n ») 
=z SY pi (9 x az) -XÀ my ` 
Feller (1949) has shown that 
3 (9 - a) EE 
£N A M55, 2mj; 


We also have that 


Hence using Lemma 2 we get Formula i 




' Lemma 3. i 
Safo- i)o) i262» 


Proof. As in proving Formula 1 we have 


Zn 1 Ja 1 ( E s) 
m) = OY eee) aa |e eee v 
noi nj-; 2d D em D fi (v j =) mg 
v n 
$ (n—v) $ (1-2) 
= ) n-».-—-[.- = 
val fi 2 a(t m | i^ Mii 
n 
( as do) (- à) 
= yf ) —— z n 
- Sin Z (t-a) Em x (0a) 3 5 
(3-2) 
By Lemma 2 and Feller's result the first term on the right of (3:2) is bounded. Now 
7 1 
|X qx 1 wt ilas x 5. 3:3) 
N x APD G ine «Xx uN rod Uo om 
Felle: " H ln y X! | 90 - ing ES 
T (1949) has shown that b | 2 aa «oo. Hence "NV ma 
3-2) is o(N). 


Using Lemma 2 we then see that the second term on the might of ( 
A similar argument, making use of the fact that 5 í 1- b: if w) « co, applies to the third 
t gr 
erm of (3-2). Hence Lemma 3 is proved. 
F ormula 2, 
n-r My, — 2(mi; = 1) mai QT E) (n—k—-1) +0(n) 
: à (E 2m; 2m; 


fo 
rv x * = 0,1,... and any integers 7; k» 0. 


Proof, We: — 
n-rn—v—k 


et n-r v) E 
Zapa- - Secrets "uer x ma 


y= 


1 
xut w- xJ-3 yal (eb (ye + PE ns Z0- V " 


=n} (Pi — m 
Jt, - 20m- ma) 4 (n—k- D o), (3-4) 
es nf 9n; Mit 
y . 
Virtue of Formula 1 and Lemma 3. 
ormula, 3. 
Wa E. —1)m; M,;- Mmi -1)mg 
EE B EUN M; — 2(m;j ppc +0(n) 
a PR Y p= Gsm +n D 2mjma E ims 


igp 77 2mm; 


for 
mij= 0, 1, ... and all integers r, bz 0. 4 


Proof. We have 


= = akp n-r O), ; 
Xx = "Son" X (Q-) Xe» e 
»-1 =i y-1 t=1 N55, »=1 mj 
Using the fact that 2 p% = — — +o(n), Formula 1, and Lemma 2 on the first term on the 


3. 
right-hand side of (3. 5) and oce 2 on the second term we get that (3:5) yields Formula 
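We are now in a position to calculate the moments of $N_n(i_1, \ldots, i_r)$.

Formula 4.
$$EN_n(i_1, \ldots, i_r) = P \Bigl\{\frac{n - r + 1}{m_{i_1 i_1}} + \frac{M_{i_1 i_1}}{2m_{i_1 i_1}^2} - \frac{m_{l i_1} - 1}{m_{i_1 i_1}}\Bigr\} + o(1),$$
where $P = \prod_{\alpha=1}^{r-1} p_{i_\alpha i_{\alpha+1}}$ and $l$ is the initial state.

Proof. Let $Y_\nu = 1$ if $X_\nu = i_1, X_{\nu+1} = i_2, \ldots, X_{\nu+r-1} = i_r$; $Y_\nu = 0$ otherwise. Given $X_0 = l$ we have
$$EN_n(i_1, \ldots, i_r) = E \sum_{\nu=1}^{n-r+1} Y_\nu = P \sum_{\nu=1}^{n-r+1} p_{l i_1}^{(\nu)}. \qquad (3·6)$$
The result follows by using Formula 1.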
Formula 5 


var N, (is, ..., i) = nP| : + P( en p Mas) o(m) 
T" . iri mi mij 
PODS SE 11 sous EA 
Proof. We have 
" 2 n =I n 
ENG ni) = BEY) = a| È niay, $ xj 
»v=1 y-r 


ver t=v+1 


at 


" : n—r n—v—r4l (3°7) 
= EN, (iy, ... i) + 2P? D oh? X p». 


t-1 
Application of Formulae 3 and 4, together with the fact that 
var Nalin ..., 4) = ENZ (i, ...,4,) — (EN, (is, ..., 4)? 


proves the result. 
Formula 6. 


ew (NAC) NA) = n 2. atu 1 CM Ma cogo. 
Uum; mym; 2ymnm; mima 


Proof. The proof is similar to those of Formulae 4 and 5. 


Formula 7. 
l M , , 
var N,(i,%) = Pafi " + 8) com 
Tug si Me Mi; i 


Proof. The proof is as in the proof of Formula 5, allowing for the fact that the lower limit for $t$ in (3·7) is 0 instead of 1.

Applying the well-known techniques of asymptotic distribution theory we get
$$\lim_{n \to \infty} \operatorname{var} \sqrt{n}\, \frac{N_n(i, j)}{N_n(i)} = \lim_{n \to \infty} \frac{m_{ii}^2}{n} \bigl\{\operatorname{var} N_n(i, j) + p_{ij}^2 \operatorname{var} N_n(i) - 2p_{ij} \operatorname{cov}(N_n(i), N_n(i, j))\bigr\},$$
$$\lim_{n \to \infty} \operatorname{cov}\Bigl(\sqrt{n}\, \frac{N_n(i, j)}{N_n(i)}, \sqrt{n}\, \frac{N_n(k, l)}{N_n(k)}\Bigr) = \lim_{n \to \infty} \frac{m_{ii} m_{kk}}{n} \bigl\{\operatorname{cov}(N_n(i, j), N_n(k, l)) - p_{kl} \operatorname{cov}(N_n(i, j), N_n(k)) - p_{ij} \operatorname{cov}(N_n(i), N_n(k, l)) + p_{ij} p_{kl} \operatorname{cov}(N_n(i), N_n(k))\bigr\}. \qquad (3·8)$$
Using the above formulae and (3·8) and after carrying out the computations necessary to evaluate
$$\operatorname{cov}(N_n(i, j), N_n(k, l)) \quad \text{and} \quad \operatorname{cov}(N_n(i), N_n(k, l))$$
for the various special cases we get the following results:
$$\lim_{n \to \infty} \operatorname{var} \sqrt{n}\, \frac{N_n(i, j)}{N_n(i)} = m_{ii}\, p_{ij} (1 - p_{ij}), \qquad \lim_{n \to \infty} \operatorname{cov}\Bigl(\sqrt{n}\, \frac{N_n(i, j)}{N_n(i)}, \sqrt{n}\, \frac{N_n(k, l)}{N_n(k)}\Bigr) = \begin{cases} -m_{ii}\, p_{ij}\, p_{il} & \text{if } i = k, \\ 0 & \text{if } i \neq k. \end{cases} \qquad (3·9)$$
The analogy of (3·9) with the second moments of a multinomial distribution, with $n/m_{ii}$ playing the role of the number of trials starting from $i$, is clear.


4. APPLICATION TO A GOODNESS OF FIT TEST

Consider the problem of testing the hypothesis that the matrix of transition probabilities is a given matrix $\{p_{ij}\}$ on the basis of $X_1, \ldots, X_n$, where $X_\nu$ $(\nu = 1, \ldots, n)$ are observed states of the Markov chain in the first $n$ steps. It is clear that a finite number of observations will not provide estimates of all the transition probabilities. This suggests choosing a finite subset of the transition probabilities independently of the data and testing the hypothesis that they are the true transition probabilities, i.e. testing, only partially, the original hypothesis. A rejection of the partial hypothesis would, of course, imply a rejection of the original hypothesis. The question as to how to select the subset of transition probabilities will not be considered. Thus, as far as this paper is concerned, the selection is arbitrary.
Let ; 
a ING) | nÜ0u-P9). R(0()-1- E Pi 
2:8 NO y yr Jh NI (mapa) ' «09 wee 


and 4,(C;) = Vm RAC f 


RAC) = 1- Eu 
j«Ci 
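where $C_i$ denotes a finite class of states depending on $i$. It can be shown using (3·9) that

$$
\begin{aligned}
\operatorname{var}(Z_{ij}) &\sim 1 - p_{ij}, & \operatorname{cov}(Z_{ij}, Z_{il}) &\sim -\sqrt{p_{ij}p_{il}} \quad (j \neq l),\\
\operatorname{cov}(Z_{ij}, Z_{kl}) &\sim 0 \quad (i \neq k), & \operatorname{cov}(Z_{ij}, Z_i(C_i)) &\sim -\sqrt{p_{ij}R_i(C_i)},\\
\operatorname{cov}(Z_{ij}, Z_k(C_k)) &\sim 0 \quad (i \neq k), & \operatorname{var}(Z_i(C_i)) &\sim 1 - R_i(C_i),\\
\operatorname{cov}(Z_i(C_i), Z_k(C_k)) &\sim 0 \quad (i \neq k). &&
\end{aligned}
\qquad (4\cdot 1)
$$

It follows from well-known asymptotic distribution theory using Theorem 4, that, if $\{p_{ij}\}$ is the true matrix of transition probabilities, the limiting distribution of any finite set of $Z_{ij}$'s and $Z_i(C_i)$'s is normal with zero means and covariance matrix given by (4·1). Let $z_{ij}$, $z_i(C_i)$ ($j \in C_i$; $i = 1, \ldots, k$) be random variables with a multivariate normal distribution with zero means and covariance matrix given by (4·1). Then it follows (see Cramér, 1946, p. 419) that

$$\chi'^2 = \sum_{i=1}^{k}\Big[\sum_{j \in C_i} z_{ij}^2 + z_i^2(C_i)\Big]$$

has a $\chi^2$ distribution with $\sum_{i=1}^{k} d_i$ degrees of freedom where $d_i$ denotes the number of states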


in $C_i$. Since the corresponding $Z_{ij}$'s and $Z_i(C_i)$'s have such a limiting distribution, it also follows (see Cramér, 1946, p. 314) that the limiting distribution of

$$\chi^2 = \sum_{i=1}^{k}\Big[\sum_{j \in C_i} Z_{ij}^2 + Z_i^2(C_i)\Big]$$

is that of $\chi'^2$. Thus the $\chi^2$ statistic supplies a $\chi^2$ test for the goodness of fit of a given finite set of transition probabilities. The $m_{ii}$'s in $\chi^2$ can, in principle, be derived from $\{p_{ij}\}$. In most cases this will not be feasible. The limiting distribution of $\chi^2$ will not be changed if the $m_{ii}$'s are replaced by their estimates obtained from the recurrence times.
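The partial goodness-of-fit statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's procedure: the function name and argument layout are hypothetical, and the quantity $n/m_{ii}$ is replaced by the observed number of visits $N_n(i)$, which turns each term into the familiar Pearson form. Each class $C_i$ is assumed to leave positive probability outside it.

```python
from collections import Counter


def markov_gof_statistic(states, p, classes):
    """Return (chi_square, degrees_of_freedom) for a partial goodness-of-fit test.

    states  : observed sequence X_1, ..., X_n
    p       : dict mapping (i, j) to the hypothesised transition probability p_ij
    classes : dict mapping a state i to its finite class C_i of successor states

    Assumption (not from the paper): n / m_ii is estimated by N_n(i).
    """
    pair_counts = Counter(zip(states[:-1], states[1:]))  # N_n(i, j)
    visit_counts = Counter(states[:-1])                  # N_n(i): visits followed by a step
    chi_square, dof = 0.0, 0
    for i, c_i in classes.items():
        n_i = visit_counts[i]
        if n_i == 0:
            raise ValueError(f"state {i!r} was never visited")
        rest = 1.0 - sum(p[(i, j)] for j in c_i)         # R_i(C_i)
        if rest <= 0.0:
            raise ValueError("C_i must leave positive probability outside the class")
        inside = 0
        for j in c_i:
            expected = n_i * p[(i, j)]
            chi_square += (pair_counts[(i, j)] - expected) ** 2 / expected
            inside += pair_counts[(i, j)]
        # Remainder term: transitions from i to states outside C_i.
        chi_square += ((n_i - inside) - n_i * rest) ** 2 / (n_i * rest)
        dof += len(c_i)                                  # d_i degrees of freedom for state i
    return chi_square, dof


chain = [0, 1, 0, 1, 0, 0, 1, 1, 0]
stat, dof = markov_gof_statistic(chain, {(0, 1): 0.5}, {0: [1]})
```

In the toy chain, state 0 is left four times, three of them to state 1, so with $p_{01} = 0.5$ each of the two terms contributes 0.5 and there is one degree of freedom.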


I wish to thank Profs. T. W. Anderson and D. A. Darling for several helpful conversations 
held while this work was in progress. 


REFERENCES

ANDERSON, T. W. & GOODMAN, L. (1956). Statistical inference in Markov chains. To appear in Ann. Math. Statist.
BARTLETT, M. S. (1951). The frequency goodness of fit test for probability chains. Proc. Camb. Phil. Soc. 47, 86-95.
CHUNG, K. L. (1953). Contributions to the theory of Markov chains. I. J. Res. Nat. Bur. Stand. 50, 203-8.
CHUNG, K. L. (1954). Contributions to the theory of Markov chains. II. Trans. Amer. Math. Soc. 76, 397-419.
CRAMÉR, H. (1946). Mathematical Methods of Statistics. Princeton University Press.
DOEBLIN, W. (1938). Sur deux problèmes de M. Kolmogoroff concernant les chaînes dénombrables. Bull. Soc. math. Fr. 66, 210-20.
FELLER, W. (1949). Fluctuation theory of recurrent events. Trans. Amer. Math. Soc. 67, 98-119.
FELLER, W. (1950). An Introduction to Probability Theory and its Applications. New York: Wiley and Sons.
KOLMOGOROV, A. N. (1936). Anfangsgründe der Theorie der Markoffschen Ketten mit unendlich vielen möglichen Zuständen. Mathematiceskii Sbornik, N.S., 1, 607-10.




A GENERAL METHOD FOR APPROXIMATING TO THE 
DISTRIBUTION OF LIKELIHOOD RATIO CRITERIA 


By D. N. LAWLEY 
University of Edinburgh 


1. INTRODUCTION 


In the theory of testing statistical hypotheses it is well known that $-2\log\lambda$, where $\lambda$ is the likelihood criterion of Neyman & Pearson (1928), is distributed for large samples approximately as $\chi^2$. This was proved by Wilks (1938). Box (1949) has shown that for a certain class of such criteria it is possible to improve on this approximation in various ways, so that in cases where the exact distribution of the criterion is unknown good approximations based on a knowledge of the moments of the criterion can be developed even for samples of moderate size. The simplest improvement consists in multiplying $-2\log\lambda$ by a scale factor which results in a statistic having the same moments as $\chi^2$ ignoring quantities of order $n^{-2}$, where $n$ is the size of the sample. This scaling device was first used by Bartlett (1937; see also his recent note, 1954). The object of this paper is to show that for any likelihood function satisfying certain very general conditions an improved $\chi^2$ test of this type is, in theory, possible.
2. AN EXPANSION FOR $L^{(k)}$

We suppose that the likelihood function, whose logarithm will be denoted by $L$, depends upon $p + q$ population parameters $\theta_1, \theta_2, \ldots, \theta_{p+q}$, assumed functionally independent. We shall assume that the derivatives of $L$ with respect to the $\theta$'s satisfy some uniform continuity condition which allows differentiation with respect to the $\theta$'s to commute with integration over the sample space. We shall further assume that the second derivatives $\partial^2 L/\partial\theta_r\partial\theta_s$ are of order $n$, where $n$ is related to the number of observations. These assumptions will usually be satisfied in practice. Let $\theta_r^0$ denote the true value of $\theta_r$. We shall be concerned with testing the 'composite' hypothesis $H_0$ that $\theta_{p+1}, \theta_{p+2}, \ldots, \theta_{p+q}$ have specified values, which if $H_0$ is true we can take to be $\theta_{p+1}^0, \theta_{p+2}^0, \ldots, \theta_{p+q}^0$, while $\theta_1, \theta_2, \ldots, \theta_p$ are unspecified and unknown 'nuisance' parameters. The criterion obtained by taking minus twice the log likelihood ratio will be written as $2(L^{(p+q)} - L^{(p)})$, where $L^{(k)}$ denotes the result of maximizing $L$ with respect to $\theta_1, \theta_2, \ldots, \theta_k$ and substituting true values for the remaining parameters.

We shall make full use of the methods developed by Bartlett (1953a, b, 1955) in three papers on approximate confidence intervals. The expansion for $L^{(k)}$ which we now obtain follows very closely § 8 of the second of these papers (in future referred to as II), though our notation is different and we have taken in terms of higher order. We shall use the notation

$$L_r = \partial L/\partial\theta_r, \quad L_{rs} = \partial^2 L/\partial\theta_r\partial\theta_s, \quad L_{rst} = \partial^3 L/\partial\theta_r\partial\theta_s\partial\theta_t, \quad \text{etc.},$$
$$\lambda_{rs} = E(L_{rs}), \quad \lambda_{rst} = E(L_{rst}), \quad \text{etc.},$$
$$l_r = L_r, \quad l_{rs} = L_{rs} - \lambda_{rs}, \quad l_{rst} = L_{rst} - \lambda_{rst}, \quad \text{etc.}$$

Then all $\lambda$'s are, in general, of order $n$ and the $l$'s are random variates of order $\sqrt{n}$ with zero expectations.




Let $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_k$ be the estimates inserted for $\theta_1, \theta_2, \ldots, \theta_k$ in $L$ to produce $L^{(k)}$. The equations for determining the $\hat\theta_r$ may be written

$$0 = L_r + L_{rs}x_s + \tfrac12 L_{rst}x_s x_t + \tfrac16 L_{rstu}x_s x_t x_u + \ldots \quad (r = 1, 2, \ldots, k), \qquad (1)$$

where $x_r = \hat\theta_r - \theta_r$, and where here and elsewhere, unless otherwise stated, the usual summation convention is employed. All suffices run from 1 to $k$. The inverse expansion of $x_r$ in terms of the $l$'s is found, with a little algebra, to be


2 
t= —Lri, = Tayl, l, re BO rs, tu ll, ly ix [m l ll, Tes ( 
where aq = DAD DÜL 

belii = TETTRE pg rois 


= ATT aj 
Crstu = Lp. ADSL vis, 


and where [75] is the inverse matrix of a: 
We next write 


Le = L+ La, + Luxus, ae [IN EET + Pa Lrstu piot ty Tes 
which in view of (1), is equivalent to 


Kk) — " 1 
L-LM = Lrt, t F GL, am, Egt LE Laut, QU, uuu. 


Substituting for the x, in this, using (2), we have 
2(10 —L) oak Lr, 1, Saul, lil, in 15, ull, l, la + 


is ritu 1L, Us esas 
Hence, including terms up to the fourth degree in the /'s, 


ULL) = Alls — 3o, lll AAP Lely — 35, ul ll 
syl lll, Anto ll, ll, — FAV APMED TY Liv 
ARON, DL Dongs 8) 
where 


Cpe = NOM DAG, 
Brot = AAA EA A sy Ais 
Yrstu = NMI hij 
and where $[\lambda^{rs}]$ is the inverse matrix of $[\lambda_{rs}]$. We have here taken the expansion as far as terms of order $n^{-1}$. As an explanation of the way in which (3) is obtained we note that


Ds = As ANSU + ArvAsvAtwT 


tias ang 


and that 
Gul, l,l, = RENE ay F Diw) lll, = BAP NIAM NED pi LU, lu dee 
= Gal. lil, + AP ARAB] ELT ui —3Ar'a ll llus see. 
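3. VARIOUS EXPECTATIONS

We now require the expectations of various products of the $l$'s. These are readily obtained by the method given in § 2 of II. We shall suppose in future that $H_0$ is true and that each $\theta_r$ takes its true value $\theta_r^0$. Using the notation

$$(\lambda_{rs})_t = \partial\lambda_{rs}/\partial\theta_t, \quad (\lambda_{rst})_u = \partial\lambda_{rst}/\partial\theta_u, \quad (\lambda_{rs})_{tu} = \partial^2\lambda_{rs}/\partial\theta_t\partial\theta_u, \quad \kappa_{rs,tu} = E(l_{rs}l_{tu}),$$

we find that

$$
\begin{aligned}
E(l_r l_s) &= -\lambda_{rs},\\
E(l_r l_{st}) &= -\lambda_{rst} + (\lambda_{st})_r,\\
E(l_r l_{stu}) &= -\lambda_{rstu} + (\lambda_{stu})_r,\\
E(l_r l_s l_t) &= 2\lambda_{rst} - \sum_{(3)} (\lambda_{rs})_t,\\
E(l_r l_s l_{tu}) &= \lambda_{rstu} - (\lambda_{rtu})_s - (\lambda_{stu})_r + (\lambda_{tu})_{rs} - \kappa_{rs,tu},\\
E(l_r l_s l_t l_u) &= \sum_{(3)} \lambda_{rs}\lambda_{tu} - 3\lambda_{rstu} + 2\sum_{(4)} (\lambda_{rst})_u - \sum_{(6)} (\lambda_{rs})_{tu} + \sum_{(3)} \kappa_{rs,tu},
\end{aligned}
$$

where, for example, $\sum_{(3)} (\lambda_{rs})_t = (\lambda_{rs})_t + (\lambda_{rt})_s + (\lambda_{st})_r$.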


Ot " * , 
her expectations, in which we neglect terms of order n, are given by 


E(LLIL,) = E {Apu Awl Ase (> s, t permuted), 
(3) 
El, l, los) => (Ano F^ Quee) As (r, 8, t permuted), 
(3) 


Eu, ls liu loj) = Anu ES (Au) m EE Qus F fA ew m1 Qui Ua [^ Qu)3 gk Aa Ktw lvw) 


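Making use of these results we find the expectation of $2(L^{(k)} - L)$, as given by (3), to be $k + \epsilon_k + O(n^{-2})$, where $\epsilon_k$ is of order $n^{-1}$ and is given by

$$
\begin{aligned}
\epsilon_k = {}& \lambda^{rs}\lambda^{tu}\{\tfrac14\lambda_{rstu} - (\lambda_{rst})_u + (\lambda_{rt})_{su}\}\\
& - \lambda^{rs}\lambda^{tu}\lambda^{vw}\{\tfrac16\lambda_{rtv}\lambda_{suw} + \tfrac14\lambda_{rtu}\lambda_{svw} - \lambda_{rtv}(\lambda_{sw})_u - \lambda_{rtu}(\lambda_{sw})_v + (\lambda_{rt})_v(\lambda_{sw})_u + (\lambda_{rt})_u(\lambda_{sw})_v\},
\end{aligned}
\qquad (4)
$$

where, as before, all suffices are summed over the values 1 to $k$. The expectation of the criterion $2(L^{(p+q)} - L^{(p)})$ may be written as

$$q + \epsilon_{p+q} - \epsilon_p + O(n^{-2}).$$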
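As an illustration of how the correction term $\epsilon_k$ may be evaluated mechanically, the following Python sketch sums the two bracketed expressions of (4) over all suffices, using exact rational arithmetic. It is a sketch under assumptions: the function names are hypothetical, the matrix $[\lambda_{rs}]$ is taken to be diagonal so that its inverse is trivial, and the expected derivatives are those of a normal sample with unknown variance (index 0) and mean (index 1), worked out by hand for this illustration.

```python
from fractions import Fraction
from itertools import product


def normal_sample_moments(n, theta=1):
    """Expected log-likelihood derivatives for a normal sample of size n.

    Index 0 is the variance (true value theta), index 1 is the mean.
    Keys are sorted index tuples; entries not listed are zero.
    """
    n, theta = Fraction(n), Fraction(theta)
    lam2 = {(0, 0): -n / (2 * theta**2), (1, 1): -n / theta}
    lam3 = {(0, 0, 0): 2 * n / theta**3, (0, 1, 1): n / theta**2}
    lam4 = {(0, 0, 0, 0): -9 * n / theta**4, (0, 0, 1, 1): -2 * n / theta**3}
    # Derivatives of the lambdas are non-zero only with respect to the variance.
    d_lam2 = {(0, 0): n / theta**3, (1, 1): n / theta**2}
    dd_lam2 = {(0, 0): -3 * n / theta**4, (1, 1): -2 * n / theta**3}
    d_lam3 = {(0, 0, 0): -6 * n / theta**4, (0, 1, 1): -2 * n / theta**3}
    return lam2, lam3, lam4, d_lam2, dd_lam2, d_lam3


def epsilon(k, n, theta=1):
    """Evaluate epsilon_k of (4) for the first k parameters (diagonal lambda_rs assumed)."""
    lam2, lam3, lam4, d_lam2, dd_lam2, d_lam3 = normal_sample_moments(n, theta)

    def key(*idx):
        return tuple(sorted(idx))

    def l3(r, s, t):
        return lam3.get(key(r, s, t), 0)

    def l4(r, s, t, u):
        return lam4.get(key(r, s, t, u), 0)

    def d(r, s, t):            # (lambda_rs)_t
        return d_lam2.get(key(r, s), 0) if t == 0 else 0

    def dd(r, s, t, u):        # (lambda_rs)_tu
        return dd_lam2.get(key(r, s), 0) if t == u == 0 else 0

    def d3(r, s, t, u):        # (lambda_rst)_u
        return d_lam3.get(key(r, s, t), 0) if u == 0 else 0

    def inv(r, s):             # element of the inverse of the diagonal matrix
        return 1 / lam2[(r, r)] if r == s else Fraction(0)

    idx = range(k)
    first = sum(
        inv(r, s) * inv(t, u)
        * (Fraction(1, 4) * l4(r, s, t, u) - d3(r, s, t, u) + dd(r, t, s, u))
        for r, s, t, u in product(idx, repeat=4)
    )
    second = sum(
        inv(r, s) * inv(t, u) * inv(v, w)
        * (l3(r, t, v) * (Fraction(1, 6) * l3(s, u, w) - d(s, w, u))
           + l3(r, t, u) * (Fraction(1, 4) * l3(s, v, w) - d(s, w, v))
           + d(r, t, v) * d(s, w, u) + d(r, t, u) * d(s, w, v))
        for r, s, t, u, v, w in product(idx, repeat=6)
    )
    return first - second


eps_1 = epsilon(1, 10)   # variance only
eps_2 = epsilon(2, 10)   # variance and mean
```

With $n = 10$ the sketch gives $\epsilon_1 = 1/(3n)$ and $\epsilon_2 - \epsilon_1 = 3/(2n)$, which agree with the values quoted in the paper's examples for the variance and for the Student $t$ criterion.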


In order to simplify the ensuing algebra we shall now assume, which can be done without loss of generality, that the parameters were originally chosen such that, for all unequal $r$ and $s$, $\lambda_{rs} = 0 = \lambda^{rs}$ (when the $\theta$'s are given their true values). We can then write, discarding temporarily the summation convention,


k k 
"o QE ARS S (Aus) + Quis Are As) F 5 x BY T Vass Ant 
—Ays(Arsh— Asst (Ags) os Ars)s (Awdl(—AmwAssAw)» (9) 


A further simplification will be made, again without loss of generality, of putting, for all $r$, $\lambda_{rr} = -n$, $\lambda^{rr} = -n^{-1}$ (for true values of the $\theta$'s). This enables us to write, in place of (3),


1 
; 1 1 l Ae hh 
2(L)_ pj = Lilt gpp bhhta Lut ig Artu "llla 


1 1 l " 

ub ER Ay Anvlelstile zu má Aysulrlslh liu 4E 3i lrlliln + n3 Ll llus (6) 
Lie ee aeo 

ignoring terms of more than the fourth degree in the $l$'s. A few words of explanation are desirable at this point. While the notation $\lambda_{rs}$ is unambiguous, the meaning of $\lambda^{rs}$ depends, in general, on the order $k$ of the matrix which is being inverted. Nevertheless, if at the outset we choose the $\theta$'s such that $\lambda_{rr} = -n$ and $\lambda_{rs} = 0$ ($r \neq s$) for $r, s = 1, 2, \ldots, p+q$, then $\lambda^{rr} = -n^{-1}$ and $\lambda^{rs} = 0$ independently of the value of $k$. We can, for example, make the choice of $\theta$'s by means of a certain linear transformation. If the original set of $\theta$'s do not satisfy the above conditions we may replace each $\theta_r$ by $\theta_r' = \sum_{s \le r} a_{rs}\theta_s$, with suitable values of the $a$'s. By this means we obtain a set of $\theta$'s which makes (6) true for all $k$. These remarks are relevant to the definition of the quantities $m_r^2$ in the next section.


4. JOINT CUMULANTS OF THE $m_r$

Now define quantities $m_1, m_2, \ldots, m_{p+q}$ by

$$m_1^2 = 2(L^{(1)} - L), \qquad m_r^2 = 2(L^{(r)} - L^{(r-1)}) \quad (r > 1).$$

Then

$$2(L^{(k)} - L) = \sum_{r=1}^{k} m_r^2,$$

and the criterion in which we are interested may be written as $\sum_{r=p+1}^{p+q} m_r^2$. It is readily found that $m_r$ is given (with suitable choice of sign) as far as terms of the second degree in the $l$'s by


l 1 1 1 7 
™, An = b teal T gh E Amdt gap, Y Alt ge LL, + "2 Ll ( ) 
where we drop the summation convention, and continue to do so until further notice.

We now establish certain results concerning the joint moments and cumulants of the $m_r$. We shall denote these by $\mu_r, \mu_{rs}, \mu_{rst}, \ldots$ (with or without dashes) and by $\kappa_r, \kappa_{rs}, \kappa_{rst}, \ldots$; in our notation $\mu_{rs} = \kappa_{rs}$ is, for example, the covariance between $m_r$ and $m_s$. For the mean of $m_r$ we have


, 1 l il -1),. 
Kn 3 pen = Gn er t ga E Ars top Amt Q2 x {— At (Ars) + Ol” 


jos He = n Bore + Berd} n7 X C7 uc Aral} + Ol). 
ser 
M A k es 15 
Since the expectation of Y, mz is k4- 6, 4- O(n72), where €, is given by (5) with Ay = $ 
r=1 


à eed 
we have ji, = 1--6,—€, ,--O(n-2). It is unnecessary for us to evaluate jij, ("+ 8). wee 


only observe that both 5, and Lrs = K,, are of order n-, 


; «on of 
For the third-order moments and cumulants we have, first, that n3/,, is the expectation 


ij 3 3 3 3 

3 

Ig enl gale Xie E lees er PX Lt T 
and thus —— 5, = nH An + BAr) AH NE X ( Ara + 0,,,) + O(n-F) 

scr 
= 3u, + O(n). 
Similarly we find lirs = ha +0) (r48), 
I4 = O(n-3) (r,s, t all unequal). 


Hence, for all values of r, s and t, Ky = ftp = O(n). 2, The 


We must next show that the fourth-order cumulants K,,,, ANA &,.., are of order" ' pra 


method employed for doing this involves exceedingly complicated and laborious alge ts 
and has the disadvantage of making the final result appear miraculous! We hav 




k 
however, been able to discover a better method. We shall find the covariance of Y, mj, as 
rum 


. £ 
given by (6), with Y; m2(r > k), asimilar expression in which, however, all suffices are summed 


ja 
Over the values 1 to r. We must retain all terms of order 2-!. It is hardly desirable to re- 
Produce all the algebra, but it may help to give a few illustrations. We reintroduce the sum- 
mation convention with the understanding that the suffices g, h, i and j are to be summed 
from 1 to k, while the suffices s, t and u are summed from 1 tor ( > k). Four ofthe terms which 


arise are: 


l 
nè goy: (1). (I; lj 


= 2k E (— 3A uu 405i 405i); — Qui — Ouid — Asisi + Koo 26060); 


1 
Bnd Noni cov {(ls lh), ly A 1;)} 


1 1 h = ; 
= ai Aonik lA * zs Aus As; 7 208). = 0:93» 


l1 
mà ^n; COV ((11,), (1,1115) 


8 4 mE 
= qi Anil Agni t Qi) + as oil — Anni + (Anadis 


1 
3a5 Astu cov {ll lu), (l; Lll) 
2 
; 2 (At Adta- At Aish 
je isa Anit Oud] zal Aj Qu qs el std j/s 
There àre altogether nineteen such terms. Their sum is found to reduce to 
l ; " 
2h E ug — 40; 4j) + zs Uo Ao + Aggi Anni 
da A ^D. 
—4Ngni(Agida— ANggi(Anala + 4(Agedn Qaa 4050; (An hj 
is 2k + 4e, +0(n-?). From this it is 


k 
H r 2 k 
e 2 mè (rz k) 
nee the covariance between à ms and P ire 
= 


ĉasi]: 

yd 
educed that (nt) = 2+ 4(e,— E1) + O(n), 
r 


cov (m2, ms) = O(n?) (r+s), 
i +O(n~); Krrrr = O(n); 
= O(n) (r+s). 


Here = Syr 
+ O(n-*), Kyrss 
that all other fourth-order cumulants 
mulants of higher order we need only 
$), those of the sixth order not 


var 


and that 


8 Hirss = Hirss 

Ne 

B the leading term of m, is lyn, it is easy to d 

tem, tainly O(n-) or less. Similarly, ned bes 
2 > T 

Nor "K that those of the fifth order are no 


5 than O(n-?), and so on. 




5. APPROXIMATE DISTRIBUTION OF THE CRITERION 


Now consider the expectation of $\prod_i (m_i^{2p_i})$, where the $p_i$ are non-negative integers, and suppose that it is expressed in terms of the joint cumulants of the $m_i$. Then, bearing in mind the orders of magnitude of these cumulants, it is clear that the only terms which matter are, first, that in $\prod_i (\kappa_{ii}^{p_i})$, which has a coefficient of $\prod_i \{(2p_i)!/(2^{p_i}p_i!)\}$, and, secondly, those in which one of the $\kappa_{ii}$ is replaced by $\kappa_i^2$, with a coefficient $p_i$ times as large. All other terms are $O(n^{-2})$ or less. Since $\mu'_{ii} = \kappa_{ii} + \kappa_i^2$, it follows that, neglecting quantities of order $n^{-2}$, the expectation of $\prod_i (m_i^{2p_i})$ is given by

$$\prod_i \frac{(2p_i)!}{2^{p_i}p_i!}\,(\mu'_{ii})^{p_i}.$$

In view of this it is clear that, to the same order of approximation, the moments of the criterion $\sum_{i=p+1}^{p+q} m_i^2$ are the same as those of

$$a_{p+1}z_{p+1} + a_{p+2}z_{p+2} + \ldots + a_{p+q}z_{p+q},$$

where $a_i = \mu'_{ii} = 1 + \epsilon_i - \epsilon_{i-1}$, and where $z_{p+1}, z_{p+2}, \ldots, z_{p+q}$ are independent $\chi^2$ variates each with one degree of freedom. So the $\rho$th cumulant of the criterion (assuming $n$ large compared with $\rho$) is

$$2^{\rho-1}(\rho-1)!\sum_{i=p+1}^{p+q} a_i^{\rho} = 2^{\rho-1}(\rho-1)!\,\{q + \rho(\epsilon_{p+q} - \epsilon_p)\} + O(n^{-2}) = 2^{\rho-1}(\rho-1)!\,q\left\{1 + \frac{\epsilon_{p+q} - \epsilon_p}{q}\right\}^{\rho} + O(n^{-2}).$$

Hence, finally, either

$$\frac{2q(L^{(p+q)} - L^{(p)})}{q + \epsilon_{p+q} - \epsilon_p} \quad\text{or}\quad 2\left\{1 - \frac{\epsilon_{p+q} - \epsilon_p}{q}\right\}(L^{(p+q)} - L^{(p)})$$

has the same moments as $\chi^2$, ignoring quantities of order $n^{-2}$.

Though the quantity $\epsilon_k$ is given explicitly by (4), the formula would not as a rule be of much practical use, particularly if the number of unknown parameters were large. In particular cases it would be easier to proceed by evaluating directly the expectation of the criterion. This method was adopted in a previous paper (Lawley, 1956) when testing hypotheses regarding the latent roots of a covariance matrix. It can be shown that both the hypotheses and the criteria employed to test them were of the same general type as those considered here. In general $\epsilon_{p+q} - \epsilon_p$ will be a function of the unknown parameters, but the substitution of estimated values for these will clearly not affect the order of the approximation.
We end with three examples. In both of the first two the criterion belongs to the class considered by Box, and the results are not new, but our main object here is to provide some slight verification of expression (5).

Example I. Suppose that we have a random sample of size $n$ from a normal population. We consider the well-known problem of testing the hypothesis that the mean $\theta_2$ has a specified value, while the variance $\theta_1$ is unknown. In this case

$$L = -\tfrac12 n\log\theta_1 - \frac{1}{2\theta_1}\{n(\bar x - \theta_2)^2 + (n-1)s^2\},$$

where $\bar x$ is the sample mean and $s^2$ is the usual estimate of $\theta_1$. The criterion is

$$2(L^{(2)} - L^{(1)}) = n\log\left(1 + \frac{t^2}{\nu}\right),$$

where $t = (\bar x - \theta_2)\sqrt{n}/s$. Differentiating $L$ we find, omitting various zero quantities,

$$\lambda_{11} = -\frac{n}{2\theta_1^2}, \qquad \lambda_{22} = -\frac{n}{\theta_1}, \qquad \lambda_{12} = 0,$$
$$\lambda_{122} = \frac{n}{\theta_1^2} = (\lambda_{22})_1, \qquad \lambda_{1122} = -\frac{2n}{\theta_1^3} = (\lambda_{22})_{11}.$$

Since $\lambda_{12} = 0$, we are able to use (5) and hence obtain $1 + \epsilon_2 - \epsilon_1 = 1 + 3/(2n)$, which is the right value for the expectation of the criterion. The corrected criterion is therefore

$$\chi_c^2 = (n - \tfrac32)\log\left(1 + \frac{t^2}{\nu}\right),$$

which may be expanded as

$$t^2\left\{1 - \frac{t^2 + 1}{2\nu} + \frac{4t^4 + 3t^2}{12\nu^2} - \ldots\right\},$$

where $\nu = n - 1$. This happens to give the 5 and 1% significance levels fairly accurately even for $n = 6$, and it agrees, as far as the term in $\nu^{-1}$, with the correct expansion, which is known to be

$$t^2\left\{1 - \frac{t^2 + 1}{2\nu} + \frac{8t^4 + 7t^2 + 3}{24\nu^2} - \ldots\right\}.$$

This may be obtained by inverting the expansion given by Fisher (1941).
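The expectation $1 + 3/(2n)$ quoted in Example I can be checked numerically. The sketch below is an illustration only and rests on a standard fact not stated in the paper: if $t$ has Student's distribution with $\nu$ degrees of freedom, then $1/(1 + t^2/\nu)$ is a Beta($\nu/2$, $1/2$) variate, so that $E\log(1 + t^2/\nu) = \psi(\tfrac12\nu + \tfrac12) - \psi(\tfrac12\nu)$, where $\psi$ is the digamma function. The helper names are hypothetical, and the digamma routine is a simple recurrence-plus-asymptotic-series approximation.

```python
import math


def digamma(x):
    """Approximate the digamma function for x > 0.

    Uses psi(x) = psi(x + 1) - 1/x to push x above 6, then the
    asymptotic series log x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6).
    """
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def expected_criterion(n):
    """E[n log(1 + t^2/nu)] with nu = n - 1, via the Beta identity above."""
    return n * (digamma(n / 2.0) - digamma((n - 1) / 2.0))


def expected_corrected_criterion(n):
    """Expectation after replacing the multiplier n by n - 3/2."""
    return (n - 1.5) / n * expected_criterion(n)


for n in (10, 50):
    print(n, expected_criterion(n), 1 + 1.5 / n, expected_corrected_criterion(n))
```

The uncorrected expectation differs from $1 + 3/(2n)$ only by terms of order $n^{-2}$, and the corrected criterion has expectation within the same order of unity.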


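Example II. As a second example, suppose that we have $k$ independent variance estimates $s_i^2$ with parent expectations $\sigma_i^2$ and degrees of freedom $n_i$ ($i = 1, 2, \ldots, k$). Then (assuming normal populations) we may take $L$ to be

$$L = -\tfrac12\sum_i n_i\{\log(\sigma_i^2) + s_i^2/\sigma_i^2\}.$$

We assume that the $n_i$ are all of the same order of magnitude and that quantities of order $n_i^{-2}$ can be ignored. The hypothesis $H_0$ to be tested is that $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$, and our criterion is $2(L^{(k)} - L^{(1)})$, where $L^{(k)}$ and $L^{(1)}$ have to be determined.

For $L^{(1)}$ it is convenient to define the parameters $\theta_i$ by

$$\theta_1 = \sigma_1^2, \qquad \theta_i = \sigma_i^2 - \sigma_1^2 \quad (i > 1).$$

Then, assuming $H_0$ to be true, the true values of $\theta_2, \ldots, \theta_k$ are all zero, and $L$ becomes

$$-\tfrac12 n(\log\theta_1 + s^2/\theta_1),$$

where

$$n = \sum_i n_i, \qquad s^2 = \sum_i (n_i s_i^2)/n.$$

Maximizing this with respect to $\theta_1$, we obtain

$$L^{(1)} = -\tfrac12 n\{\log(s^2) + 1\}, \qquad \hat\theta_1 = s^2.$$

Differentiation of $L$ with respect to $\theta_1$ gives

$$\lambda_{11} = -\frac{n}{2\theta_1^2}, \qquad \lambda_{111} = \frac{2n}{\theta_1^3}, \qquad (\lambda_{11})_1 = \frac{n}{\theta_1^3},$$
$$\lambda_{1111} = -\frac{9n}{\theta_1^4}, \qquad (\lambda_{111})_1 = -\frac{6n}{\theta_1^4}, \qquad (\lambda_{11})_{11} = -\frac{3n}{\theta_1^4}.$$

Hence use of (5) gives the expectation of $2(L^{(1)} - L)$ as $1 + \epsilon_1 = 1 + 1/(3n)$.

Since $L^{(k)}$ is the result of maximizing $L$ with respect to all the parameters it is in this case more convenient to define $\theta_i$ as $\sigma_i^2$, for all $i$. We then have

$$L^{(k)} = -\tfrac12\sum_i n_i\log(s_i^2) - \tfrac12 n,$$

and also

$$\lambda_{ii} = -\frac{n_i}{2\theta_i^2}, \qquad \lambda_{iii} = \frac{2n_i}{\theta_i^3}, \quad \text{etc.}$$

Hence, from (5), the expectation of $2(L^{(k)} - L)$ is

$$k + \epsilon_k = k + \frac13\sum_i \frac{1}{n_i}.$$

Thus the criterion, which is

$$n\log(s^2) - \sum_i n_i\log(s_i^2),$$

has an expectation of

$$(k - 1) + \frac13\left(\sum_i \frac{1}{n_i} - \frac{1}{n}\right).$$

The $\chi^2$ test is improved by using

$$1 + \frac{1}{3(k-1)}\left(\sum_i \frac{1}{n_i} - \frac{1}{n}\right)$$

as a divisor for the criterion, as first established by Bartlett (1937).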
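The corrected criterion of Example II is simple to compute. The following Python sketch is a minimal illustration; the function name and its return layout are hypothetical, and the inputs are assumed to be the variance estimates $s_i^2$ with their degrees of freedom $n_i$.

```python
import math


def bartlett_corrected(variances, dofs):
    """Return (corrected criterion, divisor, degrees of freedom).

    variances : the k independent variance estimates s_i^2
    dofs      : their degrees of freedom n_i
    """
    k = len(dofs)
    n = sum(dofs)
    pooled = sum(d * v for d, v in zip(dofs, variances)) / n        # s^2
    criterion = n * math.log(pooled) - sum(
        d * math.log(v) for d, v in zip(dofs, variances)
    )
    # Divisor 1 + {sum(1/n_i) - 1/n} / {3(k - 1)} from the expectation above.
    divisor = 1 + (sum(1 / d for d in dofs) - 1 / n) / (3 * (k - 1))
    return criterion / divisor, divisor, k - 1


stat, divisor, dof = bartlett_corrected([2.0, 2.0, 2.0], [10, 10, 10])
```

With three equal variance estimates on 10 degrees of freedom each, the criterion vanishes and the divisor is $1 + 2/45$; the corrected value would be referred to $\chi^2$ with $k - 1 = 2$ degrees of freedom.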

Example III. Lastly we consider the hypothesis that the correlation coefficient $\rho$ ($= \theta_3$) in a bivariate normal distribution has a specified value, the two standard deviations $\theta_1$ and $\theta_2$ being nuisance parameters. This example was considered by Bartlett in the third of his papers, already referred to, on approximate confidence intervals. We shall take the approximation a further stage. If $r$, $s_1$ and $s_2$ are the usual estimates of $\rho$, $\theta_1$ and $\theta_2$ respectively, obtained from a sample of size $n + 1$, then, omitting a constant, we have

$$L = -n\log\{\theta_1\theta_2\sqrt{1 - \rho^2}\} - \frac{n}{2(1 - \rho^2)}\left\{\frac{s_1^2}{\theta_1^2} - \frac{2\rho r s_1 s_2}{\theta_1\theta_2} + \frac{s_2^2}{\theta_2^2}\right\}.$$

Our criterion is easily found to be given by

$$2(L^{(3)} - L^{(2)}) = n\log\left\{\frac{(1 - \rho r)^2}{(1 - \rho^2)(1 - r^2)}\right\}.$$

In this case it would be exceedingly laborious to calculate the expectation by differentiation of $L$, and we therefore find it more directly. Expansion of the above expression in powers of $x = (r - \rho)/(1 - \rho^2)$ gives

$$n\log\{1 + x^2 + 2\rho x^3 + (1 + 3\rho^2)x^4 + \ldots\} = n\{x^2 + 2\rho x^3 + \tfrac12(1 + 6\rho^2)x^4 + \ldots\}.$$

Making use of results obtained by Hotelling (1953) we have, neglecting quantities of order $n^{-3}$,

$$E(x^2) = \frac{1}{n} + \frac{23\rho^2}{4n^2}, \qquad E(x^3) = -\frac{15\rho}{2n^2}, \qquad E(x^4) = \frac{3}{n^2}.$$

Hence the required expectation is $1 + (6 - \rho^2)/(4n) + O(n^{-2})$. For the corrected criterion we use, in place of $n$, the multiplying factor $n - \tfrac14(6 - \rho^2)$. Alternatively we may expand in powers of $z - \zeta$ instead of $x$, where


1 
a= blogs, ¢ = flog —. 


To t 
he usual order of approximation we then have that 
(n—1(6— 09) (6— O^ - 46— 05 


is distri ccs 
ibuted as 4? with one degree of freedom; and it may easily be verified that the variate 


=g- 
C) — 4 (z — C) has cumulants given by 


1 1 
Ki £ +O(n), Ke= gtm (3—p?) + O(n), 
Thi Ky = O(n), Ky = O(n). 
This could be used to provide confidence limits for $\rho$, but no practical improvement on the ordinary use of $z$ for this purpose would be achieved.

My thanks are due to Mr D. V. Lindley for his helpful suggestions for clarifying the argument in various places.


REFERENCES

BARTLETT, M. S. (1937). Properties of sufficiency and statistical tests. Proc. Roy. Soc. A, 160, 268.
BARTLETT, M. S. (1953a). Approximate confidence intervals. I. Biometrika, 40, 12.
BARTLETT, M. S. (1953b). Approximate confidence intervals. II. More than one unknown parameter. Biometrika, 40, 306.
BARTLETT, M. S. (1954). A note on the multiplying factors in various χ² approximations. J. R. Statist. Soc. B, 16, 296.
BARTLETT, M. S. (1955). Approximate confidence intervals. III. A bias correction. Biometrika, 42, 201.
BOX, G. E. P. (1949). A general distribution theory for a class of likelihood criteria. Biometrika, 36, 317.
FISHER, R. A. (1941). The asymptotic approach to Behrens's integral, with further tables for the d test of significance. Ann. Eugen., Lond., 11, 141.
HOTELLING, H. (1953). New light on the correlation coefficient and its transforms. J. R. Statist. Soc. B, 15, 193.
LAWLEY, D. N. (1956). Tests of significance for the latent roots of covariance and correlation matrices. Biometrika, 43, 128.
NEYMAN, J. & PEARSON, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika, 20A, 175 and 263.
WILKS, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. Ann. Math. Statist. 9, 60.



ON THE ACCURACY OF WEIGHTED MEANS AND RATIOS 


By G. S. JAMES
University of Leeds 


1. INTRODUCTION 


The basic problem we consider is as follows. Suppose $x_1, \ldots, x_k$ are $k$ quantities derived from observations, which are independently and normally distributed about a common mean $\mu$, but with possibly different variances $\lambda_1\sigma_1^2, \ldots, \lambda_k\sigma_k^2$. The positive constants $\lambda_i$ are known. The $\sigma_i^2$ are unknown, but estimates $s_i^2$ are available which are distributed independently of each other and of the $x_i$ in the usual mean-square forms with $\nu_i$ degrees of freedom ($\nu_i$ known). We require to find confidence limits for $\mu$. For example, the $x_i$ might be the means of $k$ samples (sizes $n_1, \ldots, n_k$) drawn from normal populations having the same mean $\mu$, but perhaps different variances $\sigma_1^2, \ldots, \sigma_k^2$. The $s_i^2$ are then the sample variances defined with degrees of freedom $\nu_i = n_i - 1$, and the $\lambda_i$ are the reciprocals of the $n_i$. Again, the $x_i$ might be the estimated slopes of $k$ regression lines, the true lines being assumed to have the same slope $\mu$, but the residual variances $\sigma_i^2$ being possibly unequal. The $s_i^2$ are then the usual estimates derived from the deviations from the fitted lines, while the $\lambda_i$ are the reciprocals of the sums of squares of deviations of the independent variate.

Let $\omega_i$ denote $1/(\lambda_i\sigma_i^2)$, the reciprocal-variance or weight of $x_i$, and let $w_i$ denote $1/(\lambda_i s_i^2)$. Write $\omega$ for $\sum\omega_i$ and $w$ for $\sum w_i$. Now if the true weights $\omega_i$ (or their ratios) were known, the quantity $\bar x = \sum\omega_i x_i/\omega$ would provide an estimate of $\mu$ of maximal accuracy (its variance being $\omega^{-1}$). It therefore seems reasonable, in the absence of any firm knowledge of the true weights, to use the estimate $\hat x = \sum w_i x_i/w$ in its place. Now although $\omega^{1/2}(\bar x - \mu)$ has a standard normal distribution, that of $u = w^{1/2}(\hat x - \mu)$ has a form which depends on the ratios $\omega_i/\omega$, which are unknown. Thus tables of this distribution, even if available, would be of no assistance in finding confidence limits for $\mu$. Nevertheless, it may be possible to find a function $u(r_1, \ldots, r_k)$, or $u(r)$ for short, of the ratios $r_i = w_i/w$ ($\sum r_i = 1$), which is such that

$$\Pr[\,|u| < u(r)\,] = P, \qquad (1\cdot 1)$$

either exactly or approximately, where $P$ is the required confidence coefficient. Confidence limits for $\mu$ are then $\hat x \pm u(r)/w^{1/2}$. Of course $u(r)$ will depend in addition on $P$ and on the degrees of freedom $\nu_i$. It is not known for certain whether the functional equation (1·1) does or does not possess an exact solution, but in this paper we present a solution which is asymptotically correct when all the degrees of freedom are large. More specifically we give


a function $u(r)$ for which

$$\Pr[\,|u| < u(r)\,] = P + O(\nu^{-3}) \qquad (1\cdot 2)$$

and tabulate it for the case $k = 2$.

The ordinary large-sample approximation would be to take $u(r)$ equal to the appropriate point of the standard normal distribution. This is also asymptotically correct, but satisfies (1·2) with $\nu^{-1}$ in place of $\nu^{-3}$.

We also show how the tables of this paper, and certain others originating from the work of B. L. Welch, may be used to provide confidence limits for certain quantities which are estimated as ratios.



2. GENERAL DISCUSSION: YATES'S SOLUTION

In a sense the present problem is dual to that discussed by Welch (1947a, b) and by Trickett & Welch (see also further references below). Their problem was that of finding confidence limits for some previously specified linear combination, $\eta = \sum f_i\mu_i$, of $k$ population means (not here assumed equal), the assumptions being otherwise the same as before. (In particular, confidence limits were desired for the difference between the means of two normal populations when the variances were not assumed to be equal.) More precisely, if $y = \sum f_i x_i$, $a = \sum a_i = \sum f_i^2\lambda_i s_i^2$ (the estimated variance of $y$), $c_i = a_i/a$ ($\sum c_i = 1$) and $v = a^{-1/2}(y - \eta)$, then the problem was to find a function $v(c_1, \ldots, c_k) = v(c)$ which was such that

$$\Pr[\,|v| < v(c)\,] = P. \qquad (2\cdot 1)$$

Functions $v(c)$ have been given which satisfy this equation within terms of order $\nu^{-3}$ (Welch, 1947a) and within terms of order $\nu^{-5}$ (Aspin, 1948). Tables (for the case $k = 2$) have been given by Aspin (1949) and by Trickett, Welch & James (1956) (see also Pearson & Hartley).

Now a different solution to the problem of finding limits for the difference of two means was put forward by Behrens (1929). This has been rederived by Fisher (1935, 1939, 1941) and Yates (1939) using the theory of fiducial distributions of population parameters, and by Jeffreys using an inverse probability approach. Tables (again for the case $k = 2$) have been given by Sukhatme (1938) and by Fisher (1941) (see also Fisher & Yates, 1953). The argument using fiducial distributions is as follows. Let


$$t_i = (x_i - \mu_i)/(\lambda_i s_i^2)^{1/2} = f_i(x_i - \mu_i)/a_i^{1/2}. \qquad (2\cdot 2)$$

Then the $t_i$ have independent Student distributions with $\nu_i$ degrees of freedom. Equation (2·2) is equivalent to

$$\mu_i = x_i - a_i^{1/2}t_i/f_i, \qquad (2\cdot 3)$$

and hence

$$\eta = y - \sum a_i^{1/2}t_i. \qquad (2\cdot 4)$$

It is now asserted that given a unique set of values $x_i$ and $s_i^2$, (2·3) gives the distribution (in the fiducial sense) of the parameters $\mu_i$. That is to say, if $t_P$ denotes the value of $|t|$, based on $\nu_i$ degrees of freedom, which is only exceeded with probability $(1 - P)$, then the fiducial probability that $\mu_i$ lies between the (fixed) limits $x_i \pm a_i^{1/2}t_P/f_i$ is numerically equal to $P$. That is to say the fiducial distribution of $\mu_i$ is obtained from that of $t_i$ using (2·3) and is a transformed $t$-distribution. Likewise the fiducial distribution of $\eta$, given the unique samples, is said to be that of the linear combination of $t$-variates given by (2·4), with $y$ and the $a_i$ treated as constants. Now

$$v = a^{-1/2}(y - \eta) = \sum c_i^{1/2}t_i. \qquad (2\cdot 5)$$

Thus if $d = d(c)$ denotes that number which is such that (for constant $c_i$)

$$\Pr\big[\,\big|\textstyle\sum c_i^{1/2}t_i\big| < d\,\big] = P, \qquad (2\cdot 6)$$

then it is asserted that

$$\mathrm{Fp}[\,|v| = a^{-1/2}|y - \eta| < d(c)\,] = P, \qquad (2\cdot 7)$$

where 'Fp' denotes the fiducial probability† of the relation following. This is a fiducial statement about $\eta$ given a unique set of values $x_i$ and $s_i^2$. It is not asserted, and it is not true in general, that

$$\Pr[\,|v| = a^{-1/2}|y - \eta| < d(c)\,] = P, \qquad (2\cdot 8)$$

where this statement is interpreted in the ordinary way ($\eta$ a constant; $y$, $a$ and the $c_i$ having their ordinary direct probability distributions). It is held by those who use the fiducial argument that the fact that (2·8) is untrue is completely irrelevant, and that when, as in many other procedures based on normal distributions, statements equivalent to (2·7) and (2·8) are both true (each with its own interpretation), this is a confusing accident.

However, this does not seem to be the place to argue about the two points of view. My purpose in mentioning the fiducial argument is to recall that Yates (1939) showed that it leads to a solution of the weighted mean problem which can be applied using the same tables as are needed for the Behrens-Fisher problem. For if the $r_i$ can be regarded as constants, then

$$u = w^{1/2}(\hat x - \mu) = \sum r_i^{1/2}t_i \qquad (2\cdot 9)$$

has the same distribution as $v$ of (2·5), except that the quantities $c_i$ are replaced by the $r_i$. Thus

$$\mathrm{Fp}[\,|u| = w^{1/2}|\mu - \hat x| < d(r)\,] = P. \qquad (2\cdot 10)$$

This equation gives limits, in Fisher's fiducial sense, for the parameter $\mu$. The duality between the original Behrens-Fisher problem and the weighted mean problem with $k = 2$ is particularly striking, because with $f_1$ and $f_2 = \pm 1$ we have $r_1 = c_2$ and $r_2 = c_1$. In the Sukhatme-Fisher tables, where the quantity $\theta = \sin^{-1}c^{1/2}$ is used as an argument in place of $c = c_1$, the change merely amounts to using $(\tfrac12\pi - \theta)$ in place of $\theta$.

If confidence limits in the ordinary sense of equation (1·1) are required, the tables of Welch's $v(c)$ cannot be adapted in a similar manner, and completely new ones, giving $u(r)$, are necessary. Even with the two sets of tables available the duality (if indeed that is the correct word) is very one-sided; for the $u$-problem is fundamentally more complicated than Welch's $v$-problem. This is because in the latter the combination occurring in the numerator is $f_1x_1 + f_2x_2$, the $f_i$ being constants, whereas in the former it is $w_1x_1 + w_2x_2$, the $w_i$ being random variables. In fiducial theory the $w_i$ are also (at a certain stage) treated as constants, thus leading to an almost complete duality between the two problems.


3. THE ASYMPTOTIC SOLUTION 


, gh the 
We suppose that we have quantities $x_1, \ldots, x_k$, which are normally distributed about the same mean $\mu$ but with possibly different variances $\alpha_1 = 1/\omega_1, \ldots, \alpha_k = 1/\omega_k$, and that $a_1 = 1/w_1, \ldots, a_k = 1/w_k$ are estimates of the $\alpha_i$, $\nu_i a_i/\alpha_i$ being distributed as $\chi^2$ with $\nu_i$ degrees of freedom, the $\nu_i$ being known positive integers. The $x_i$ and $a_i$ distributions are supposed

:quo Dt 
† I am assuming that an expression such as the left-hand side of (2·7) is capable of a unique interpretation. Fisher has laid great emphasis on the theoretical necessity of defining the fiducial distributions of parameters in a unique fashion, but even when one only makes use of sample statistics jointly sufficient for the parameters, it sometimes seems to be possible to produce more than one fiducial distribution for the same quantity. The interested reader may consult Creasy (1954) (including the Discussion held at the end of the Symposium), and Mauldon (1955).


conditions laid down by Fisher for the derivation of fiducial distributions; but in practice 
neglected for the time being. 




all mutually independent. No prior knowledge of $\mu$ or the $\alpha_i$ is assumed, and it is desired to find confidence limits for $\mu$ with confidence coefficient (asymptotically) equal to $P$. Let $\hat x = \sum w_i x_i/w$, where $w$ denotes $\sum w_i$. Then the problem will be solved if we can find a function $u(a_1, \ldots, a_k) = u(a)$ of the sample variances (and of $P$ and the $\nu_i$) which is such that (asymptotically)

$$\Pr[\,w^{1/2}|\hat x - \mu| < u(a)\,] = P. \qquad (3\cdot 1)$$

As remarked in § 1, $u(a)$ actually depends only on the ratios of the sample variances or weights, and could be written $u(r)$, where $r_i = w_i/w$, but the above form is more convenient for the present discussion. It must be remembered that both sides of the inequality in (3·1) are random variables, and that $w$ and $\hat x$ depend upon the $a_i$, as well as $u(a)$.

The method of solution is basically that due to Welch (1947a). It turns out to be possible to express $u(a)$ in the form

$$u(a) = u_0(a) + u_1(a) + u_2(a) + \ldots, \qquad (3\cdot 2)$$

where $u_s(a)$ is of order $\nu^{-s}$, and where, if $\sum_{s=0}^{j} u_s(a)$ is written in place of $u(a)$ in (3·1), the equality is satisfied to within terms of order $\nu^{-j-1}$. Large-sample theory gives the initial term $u_0(a) = \chi$, where

$$(2\pi)^{-1/2}\int_{-\chi}^{\chi} e^{-t^2/2}\,dt = P. \qquad (3\cdot 3)$$

The first corrective term can be derived by making the appropriate substitutions in the general theory presented in a previous paper (James, 1954, equation (2·53) or (2·59)). It is
T; 
x- (3-4) 


=, 
P. 
Vi 


"5 " 
ma) = 108 — 7X) 2, + 2X 
t 
at was there called the ‘general 


an example of wh 
estimation; that is to say, 


s first correc 


or à A i 
nalytie complication, and only thi 
where the estima 


Prob 
in a considered there. (For the ‘special case’ 
Cory, 9 linear hypothesis are functionally independent of the variance estimates a second 
for the particular problem of weighted means the 


volved nageable, although very lengthy calculations 2 

problem (ies Aspin recorded that the algebra for the can eas en in ne s 

Present ran to more than 100 pages; anyone wishing to M e the same erm in the 
1 problem should be prepared for considerably more work than this.) 

z following is a brief summary of Welch’s computational technique. The left-hand side 

'1) can be written as the average, OVeT the distributions of the variance estimates d;, 


oft 
(3. he conditional probability that wt |&—-/ | &w(a) for fixed t; That is to say, our equation 
for u(a) can be written schematically as 
je [wt | £—/^ | <u(a) | a] Pr [da] - P. 
ing terms of order 1, v 


(;%;/@) and its derivatives with 


distribution of $ are regarded as 
S and 


| 


(3:5) 


series contain: -1 y2, ..., and 


here 7 denotes x 
ally in the 
4 those involved in the eapressions W, 


20-2 


Th 

de hand side can be expressed as à 

"éspect ng upon Pr [wt | z— | <ula)] (w. ; 

Const to the o. (The æ; involved parametric 
ants in evaluating these derivatives, but no 


308 Accuracy of weighted means and ratios 


; "ders on both 
u(a).) The series (3-2) is now substituted for u(x) and terms of corresponding or 2 d nem 
sides are equated. There results the following set of equations for us(a), w;(a), ... 
for ugla), u,(a), ...: 


3:6) 

Pr[oi|z—54|«wy(x)-—P, giving w(x) =%(a)=x, ( 
(377) 

Di (2) D + Xo30i[v;] Pr [o | 257p | & x] — 0, 

[us(a) D + buila) D? + (Xa382]v;) u (x) D aut 


z =0, ( 
+ $203 n + 3Xodo30205|v;v;] Pr [o | &— s | & x] 


[us D + us (2) w(x) D? + dad (2) D? + (Ea 82]vi) (uala) D + (a) D?) 
+ ($Eotei[vi + 3Xado$0105|v;v;) u (x) D + 2:033] is 
tirata piv, + Rota a 303 95v vv] Pr [wt |E- | & x] 7 9- 
r a 
Here 90; denotes 3/ðx; (interpreted in the sense indicated above) and D denotes qu a 
term like o20tu,(x) D Pr [...], 0? operates on «,(z) and on Pr [...], but not on the i vasti 
The derivatives with respect to the æ; are most readily evaluated by the Taylor "Y "Y 
technique used in my 1954 paper. After evaluating all the derivatives involved i 
(3:8) and (3-9) and performing the summations we finally find the result 


ula) or w(r)= XL1 + (h(x? T) Us + 204) + (A(x! — 8132 + 93) Us, 
FEL - 61) Us + 8U, + 3,( — IX! + 2082 — 543) UZ 
FEX ET) UU, -2U8) 


+ {5(5x° — 4174 + 4957? — 8997) Up 

+5(28x4— 6432+ 1668) Us, + 11 (72 — 33) Ugg + 32, 

+l 335-1843 — 18523? + 2973) Ups Uy 

+79 — x 3133 — 93) UU, + 3(— 994+ 19842 — 4677) Uzo Un 

*EBC- 11%” + 61) UU, + 6(— x?4- 7) Ui; Uy, — 1602 Ui; 

+ sz (2435 — 12053 + 1055692 — 153063) UŽ 4.10) 
+ ye(9X'— 208? + 543) Us, Un - (x? — 7) Un Ut; + 4U$3] + ow). ( 


41) 
(1 
where U,—-XhwW, r-ww. 
As a check on the lengthy algebra involved the calculation was repeated, but instead of starting with (3·1) the equivalent equation

$$\Pr[\,w(\hat{x} - \mu)^2 < 2h(s)\,] = P \tag{3·12}$$

for $2h(s) = [u(s)]^2$ was used. The zero-order approximation to $h(s)$ is $\xi = \tfrac{1}{2}\chi^2$, where

$$\frac{1}{\Gamma(\tfrac{1}{2})}\int_0^{\xi} t^{-1/2} e^{-t}\,dt = P, \tag{3·13}$$

and writing $h(s) = \xi + h_1(s) + \ldots$ we complete the solution as before. Finally, $u(s)$ is evaluated as the square root (to order $\nu^{-3}$) of $2h(s)$.
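The agreement between the two zero-order starting points can be checked numerically. The following is only an illustrative sketch, not part of the original computations: the function names `chi_for` and `gamma_half_cdf` are hypothetical, and the midpoint rule with a fixed number of panels is an arbitrary choice. It finds the normal deviate of (3·3) for a given $P$, sets $\xi = \tfrac{1}{2}\chi^2$, and evaluates the integral of (3·13).

```python
import math
from statistics import NormalDist


def chi_for(P):
    """Normal deviate chi with Pr(|Z| < chi) = P, as in (3.3)."""
    return NormalDist().inv_cdf(0.5 * (1.0 + P))


def gamma_half_cdf(xi, panels=20000):
    """Gamma(1/2)^-1 * integral from 0 to xi of t^(-1/2) e^(-t) dt, as in (3.13).

    The substitution t = s^2 removes the singularity at t = 0:
    the integrand becomes 2 * exp(-s^2) on 0 <= s <= sqrt(xi).
    A plain midpoint rule is used (an assumption of this sketch).
    """
    upper = math.sqrt(xi)
    h = upper / panels
    total = sum(2.0 * math.exp(-((i + 0.5) * h) ** 2) for i in range(panels))
    return total * h / math.gamma(0.5)


P = 0.95
chi = chi_for(P)          # zero-order term u_0
xi = 0.5 * chi ** 2       # zero-order term for h
check = gamma_half_cdf(xi)  # should reproduce P
```

If the two formulations are equivalent, `check` reproduces the chosen value of `P` to within the quadrature error.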




4. TABLES FOR THE TWO-SAMPLE PROBLEM 


In the case $k = 2$, $u(r)$ is a function of the single quantity

$$r = \frac{w_1}{w_1 + w_2}, \tag{4·1}$$

apart from $\nu_1$, $\nu_2$ and $P$.
he v. 
| Pte o oy l — ee bes aes am = an ordinary desk computer, using 
and 0-01 were worked out on the aut aes "s "digi ps ud m inea ie 
eie omatic digital computing machine at Manchester 
edema have here been tabulated after rounding off to two decimal places. For 
Eus a ep of freedom tables of the normal distribution show that a rounding-off error 
Eos n oe, : change in probability varying from about 0-002,1 when (1— P) — 0-10 to 
Jes nem z hen (1 + P)= 0-01; that is, to put the most unfavourable light on it, the 
error due to this cause is less than 3 % of $(1 - P)$ for the values tabulated. Since the values of $u(r)$ are always at least as large as the values with infinite degrees of freedom, rounding-off errors are likely to be even less serious in general. Moreover, when the degrees of freedom are only moderate a very considerable averaging effect takes place, for $r$ varies from sample to sample and some tabular values are likely to be too small and others too large. In an Appendix to Mrs Aspin's tables (1949) for the $v$-problem, Welch shows that, in the particular case $(1 - P) = 0·10$, $\nu_1 = \nu_2 = 6$ (our notation) and equal population variances, the true probability using the two-place tables never differs from the nominal by more than about 0·000,4, compared with the bound 0·002,1 suggested above. I have not done any similar calculations using the two-place tabular values given here, but some details of the performance of the five-place figures from which they were derived are given in §6.

In deciding on the minimum values of $\nu_1$ and $\nu_2$ to be given in the tables, the principle adopted was that alterations of at most two or three units shall be necessary in the second decimal place, should some improved method of calculation become available in the future. Of course guessing 'what the next term would be' is a dangerous procedure, but the third corrective term is never greater than 0·04; this value occurs when $(1 - P) = 0·01$, $\nu_1 = \nu_2 = 10$. A possible future change of several units in the last place has been accepted as a compromise between the desire to produce tables which are mathematically correct, and ones which shall be practically useful. The standard is lower than that imposed in the Welch-Aspin tables, where not only is the term of order $\nu^{-4}$ available for evaluation, but the series itself is more quickly convergent (in the practical sense of the word). If it were merely required that the probabilities should be close to their nominal values it would no doubt be possible to extend the range (cf. Table 6·1).
Ones which shall be practically usefu 


5. INTERPOLATION IN THE TABLES

For most purposes it will suffice to interpolate linearly with respect to all three arguments, although in certain cases this may lead to an error of several units in the second decimal place. In cases of doubt, harmonic interpolation should be used for the degrees of freedom. Harmonic interpolation should in any case be used for either $\nu_i$ in the panel which includes $\infty$.
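Harmonic interpolation means interpolating linearly in the reciprocal of the degrees of freedom, so that $\infty$ corresponds to the finite argument 0. A minimal sketch follows. The function name `harmonic_interpolate` is hypothetical, and the target of 40 degrees of freedom is an assumed illustration. The two end values, 1·72 for 20 degrees of freedom and 1·64 for $\infty$, are read from the $r = 0·0$ column of Table 1.

```python
import math


def harmonic_interpolate(nu, nu_lo, u_lo, nu_hi, u_hi):
    """Interpolate linearly in 1/nu between two tabulated degrees of freedom.

    nu_hi may be math.inf, in which case 1/nu_hi is taken as 0.
    """
    x, x_lo, x_hi = 1.0 / nu, 1.0 / nu_lo, 1.0 / nu_hi
    frac = (x_lo - x) / (x_lo - x_hi)
    return u_lo + frac * (u_hi - u_lo)


# Assumed illustration: 40 degrees of freedom, between the panels for 20 and infinity.
u_40 = harmonic_interpolate(40, 20, 1.72, math.inf, 1.64)
```

Since 1/40 lies half-way between 1/20 and 0, the interpolated value is the mean of the two tabular entries. Ordinary linear interpolation in $\nu$ itself is not even defined in this panel.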




6. THE ACCURACY OF THE SOLUTION 


The performance of the solution (3-10), or of any other proposed solution, may ba de 

by numerical integration. This process could no doubt also be made the basis of an iter F A 

method of calculating u(r), like that described by Trickett & Welch (1954) for the dua: 

problem. The integral to be evaluated can be derived as follows. T 

The quantities y? — v;o»;/w; are distributed independently of the x; and of each othe 

x?-distributions with v; degrees of freedom. Write 

d 

=I k-XjEg Gh-n. e 


NES the 
Then x? has a y?-distribution with v, — Xv; degrees of freedom, and is independent of 


distribution of the b;, which is 


_ D») MÀ Sb. =1 
2(b)db Trg IP I'd; (b, ...,0,2.0; Zb;— 1). 


il 
II denotes a product over i— 1, ..., k, and I’ a product over i= 1, ..., (k— 1) (say). We east 
find 


wi(4—p) = Ew(v-n) | Enol: — p)]b; 


wt QPXv;o;[b;* 
E [Eri Mb (Ey | | Er? p;/b? l (6:3) 
(Zrob) Wa Vv; pifbi] " 


z 
where (6 ) 


p,—ojo, Xp;-l. 
- "T : ith 
Now the conditional distribution of the first factor in (6-3), given the b;, is that cons 


al 
vs degrees of freedom, by the usual rule. (Its marginal distribution is the same, but We ™ 
no use of this fact.) We also have 


u(r) or u(r) =a (see) : i 
Therefore Prit | 2-4 | <u(r)]= | Pr [wt | $ u | <u(r) | b] Pr [db] (cf. (3:3) 
-Jr- [s eio Ü 

where 7? denotes the ordinary t-integral: 
(67) 


RW) =BG, wy} f » EEEN 
—=t/ vv 


The integral in (6-6) is a (k 


rti r cas? 
— 1)-fold one over the distribution (6:2). In the par tioula 
k=2 (6-6) may be written 


Í 1 E (eoru vun v,pb' ) bpd (6°8) 
a 3 vipb’2 + v3p'b? u vipb' EH Vap'b Bk, i») á 




Where p — o] 

4/9 — (1 — p') and b’=(1—5). The integrals n implify in variou 
zi i . grals (6-6) and (6-8) simplify i ri 
Ways for equal degrees of freedom, and for p; oc 1 m For ae = m ai 


k=2, y 2v 2v p=}, 


(6-8) become i 2bb' y by 
S P = T n] Gb 
I, = [s T) E ) B(àv, v) s (63) 


r) is of great assistance in its numerical evaluation. 
: 


The simpli ; 
e simplification in the argument of u( 
erpolation is necessary in the tables of 


No integration formula is used no int 

The cire Map is of the simplest kind with twenty panels. 
place ale e pei in Table 6-1 have been obtained in the case |; — 2, using the five- 
Eroxia A u(r) from which the two-place tables were compiled, and also the lower-order 
ations correct to v-?, v, 1. The last-mentioned is the ordinary large-sample 


Approximation. 


Table 6·1. Actual values of 100(1 − P) compared with nominal values

                          Nominal value of (1 − P)
Order of            10 % (ν1 = ν2 = 8)               1 % (ν1 = ν2 = 10)
approximation   ρ = 0 or 1   0·2 or 0·8    0·5      ρ = 0 or 1     0·5
in 1/ν
Zero               13·86       15·87      16·54        2·76        3·33
First              10·38        ...       10·95        1·19        ...
Second             10·03        ...       10·07        1·02        ...
Third              10·00        9·99      10·02        1·00        1·01


The calculations for the two-tailed 1 % point with $\nu_1 = \nu_2 = 10$ and $\rho = 0·5$ were carried out because of the large value (about 0·04) of the third corrective term, $u_3(r)$, in this region. It is seen that the actual probability is 1 % for practical purposes, even although the large-sample procedure would give rise to a probability over three times as great.


7. CONFIDENCE LIMITS FOR RATIOS

Finney (1950, 1952) has given some useful extensions of a method which was first given in its general form by Fieller (1940). These results all have to do with finding limits for ratios.

The procedure is as follows. Suppose that $y$ and $z$ are normally distributed, unbiased estimates of $\eta$ and $\zeta$, and that their variances are $\lambda\sigma^2$ and $\lambda'\sigma^2$ and that their covariance is $\lambda''\sigma^2$; $\lambda$, $\lambda'$, $\lambda''$ are known constants, but $\sigma$ is unknown and is independently estimated by $s$ based on $\nu$ degrees of freedom. It is required to find limits for $\mu = \eta/\zeta$ from the data. Now $(y - \mu z)/[s\sqrt{(\lambda - 2\mu\lambda'' + \mu^2\lambda')}]$ has the $t$-distribution with $\nu$ degrees of freedom. Thus if $t$ denotes the value only exceeded (numerically) with probability $(1 - P)$, we have

$$\Pr[(y - \mu z)^2 < t^2 s^2(\lambda - 2\mu\lambda'' + \mu^2\lambda')] = P. \tag{7·1}$$



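Thus a corresponding confidence region for $\mu$ consists of those values which satisfy the inequality on the left-hand side. The typical case is when $z^2 - t^2s^2\lambda' > 0$, that is to say when the denominator $z$ is significantly different from zero (at level $(1 - P)$). In this case the region is an ordinary interval, confidence limits for $\mu$ being

$$\mu_1, \mu_2 = \frac{yz - t^2s^2\lambda'' \pm ts[(z^2\lambda - 2yz\lambda'' + y^2\lambda') - t^2s^2(\lambda\lambda' - \lambda''^2)]^{1/2}}{z^2 - t^2s^2\lambda'}$$

$$= \frac{m - g\lambda''/\lambda' \pm (ts/z)[(\lambda - 2m\lambda'' + m^2\lambda') - g(\lambda - \lambda''^2/\lambda')]^{1/2}}{1 - g}, \tag{7·2}$$

where

$$m = y/z \quad\text{and}\quad g = t^2s^2\lambda'/z^2. \tag{7·3}$$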

But it is important to realize (because similar phenomena can occur in the generalization, considered below, although they will not be explicitly mentioned) that if $z^2 - t^2s^2\lambda' < 0$, i.e. $g > 1$, then the limits (7·2) become exclusive; that is to say, the region consists of the parts $\mu < \mu_1$ and $\mu > \mu_2$ (taking $\mu_1 < \mu_2$), and does not exclude the value $\mu = \infty$. It is even possible for the discriminant of the quadratic to become negative in this case, when the confidence region becomes $-\infty < \mu < \infty$; that is to say, the data are in agreement (at the given significance level) with any hypothetical value of $\mu$. These phenomena are discussed in detail by Fieller (1954). The second form of (7·2) is useful because when $g$ is very small it reduces to the result that would be obtained intuitively from the evaluation of the variance of $m$ using 'statistical differentials' (and ignoring the inconvenient fact that this variance is really infinite).
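The three cases described above (ordinary interval, exclusive limits, whole line) can be sketched in a few lines. This is an illustrative sketch only: the function name `fieller_limits` and the numerical inputs are hypothetical, and `lam`, `lam1`, `lam2` stand for $\lambda$, $\lambda'$, $\lambda''$. It evaluates the second form of the limits, with $m = y/z$ and $g = t^2s^2\lambda'/z^2$, and does not handle the boundary case $g = 1$.

```python
import math


def fieller_limits(y, z, s, t, lam, lam1, lam2):
    """Return (kind, lower, upper) for mu = eta/zeta.

    kind is "interval"   when g < 1 (region is lower < mu < upper),
            "exclusive"  when g > 1 (region is mu < lower or mu > upper),
            "whole line" when the discriminant is negative.
    """
    m = y / z
    g = t * t * s * s * lam1 / (z * z)
    disc = (lam - 2 * m * lam2 + m * m * lam1) - g * (lam - lam2 * lam2 / lam1)
    if disc < 0:
        return ("whole line", -math.inf, math.inf)
    half = (t * s / abs(z)) * math.sqrt(disc)
    centre = m - g * lam2 / lam1
    a = (centre - half) / (1 - g)
    b = (centre + half) / (1 - g)
    return ("interval" if g < 1 else "exclusive", min(a, b), max(a, b))


def residual(mu, y, z, s, t, lam, lam1, lam2):
    """Difference between the two sides of the inequality at a given mu."""
    return (y - mu * z) ** 2 - t * t * s * s * (lam - 2 * mu * lam2 + mu * mu * lam1)


# Hypothetical inputs: a well-determined denominator, then a poorly determined one.
args_good = (2.0, 5.0, 1.0, 2.0, 1.0, 1.0, 0.2)
args_poor = (2.0, 1.0, 1.0, 2.0, 1.0, 1.0, 0.2)
kind_good, lo_good, hi_good = fieller_limits(*args_good)
kind_poor, lo_poor, hi_poor = fieller_limits(*args_poor)
```

Each returned limit should make `residual` vanish, since the limits are the roots of the quadratic obtained by replacing the inequality with an equality.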
Finney's generalizations make use of the Behrens-Fisher distribution, and lead to fiducial intervals, but not to confidence intervals in the direct probability sense. We now consider how these problems can be dealt with using the Welch-Aspin tables of $v$ and the new tables of $u$.

First suppose that the conditions are the same as before, except that the variances of $y$ and $z$ are $\lambda\sigma^2$ and $\lambda'\sigma'^2$, while their covariance vanishes. Independent estimates $s^2$ and $s'^2$ of $\sigma^2$ and $\sigma'^2$, based on $\nu$ and $\nu'$ degrees of freedom, are available. Then $(y - \mu z)$ has zero expectation and variance $\lambda\sigma^2 + \mu^2\lambda'\sigma'^2$, estimated by $\lambda s^2 + \mu^2\lambda' s'^2$. Thus

$$\Pr[(y - \mu z)^2 < \{v(c)\}^2(\lambda s^2 + \mu^2\lambda' s'^2)] = P, \tag{7·4}$$

where

$$c = \lambda s^2/(\lambda s^2 + \mu^2\lambda' s'^2) \tag{7·5}$$

and $v(c)$ is the value given by the Welch-Aspin tables for $\nu$ and $\nu'$ degrees of freedom and probability $P$. In the case of a well-determined denominator (7·4) gives the confidence limits

$$\mu_1, \mu_2 = \frac{m \pm (v/z)[(1 - g)\lambda s^2 + m^2\lambda' s'^2]^{1/2}}{1 - g}, \tag{7·6}$$

where $g = v^2\lambda' s'^2/z^2$ and $v$ is short for $v(c)$. Since $c$, and hence $v$, depends on $\mu_1$ or $\mu_2$ as the case may be, calculations have to be made iteratively using (7·5) and (7·6) alternately. As an initial approximation one could take $\mu = m$ in (7·5).

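Now suppose that $\eta_1/\zeta_1 = \ldots = \eta_k/\zeta_k = \mu$, and that we have unbiased and normally distributed estimates $y_1, \ldots, y_k$, $z_1, \ldots, z_k$ of $\eta_1, \ldots, \eta_k$, $\zeta_1, \ldots, \zeta_k$. The variances and covariance of $y_i$, $z_i$ are $\lambda_i\sigma_i^2$, $\lambda_i'\sigma_i^2$, $\lambda_i''\sigma_i^2$, but different pairs are independent. Independently distributed estimates $s_i^2$ of the $\sigma_i^2$, based on $\nu_i$ degrees of freedom, are available. We require to find confidence limits for $\mu$. Now $x_i = y_i - \mu z_i$ has zero expectation and variance estimated by $(\lambda_i - 2\mu\lambda_i'' + \mu^2\lambda_i')s_i^2$. Therefore if

$$w_i = \frac{1}{(\lambda_i - 2\mu\lambda_i'' + \mu^2\lambda_i')s_i^2}, \qquad w = \sum_i w_i, \tag{7·7}$$

then

$$\Pr\Big[\Big(\sum_i w_i x_i\Big)^2 < \{u(r)\}^2 w\Big] = P, \tag{7·8}$$

where

$$r_i = \frac{1/[(\lambda_i - 2\mu\lambda_i'' + \mu^2\lambda_i')s_i^2]}{\sum_j 1/[(\lambda_j - 2\mu\lambda_j'' + \mu^2\lambda_j')s_j^2]} \tag{7·9}$$

and $u(r)$ is the value given by the tables in this paper (or their theoretical counterparts) for the specified degrees of freedom and probability. Thus the corresponding confidence region is that of those values of $\mu$ which satisfy

$$\left[\sum_i \frac{y_i - \mu z_i}{(\lambda_i - 2\mu\lambda_i'' + \mu^2\lambda_i')s_i^2}\right]^2 < \{u(r)\}^2 \sum_i \frac{1}{(\lambda_i - 2\mu\lambda_i'' + \mu^2\lambda_i')s_i^2}. \tag{7·10}$$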

a era] SU 07? r ie. 
This rather complicated inequality, which has to be considered in conjunction with the values of the $r_i$ given by (7·9), may be of limited practical usefulness, although one imagines that when the denominators $z_i$ are well determined it would give a single confidence interval, whose limits could be found by an iterative procedure (cf. example 8·3). But a very simple case occurs when the constants $\lambda_i$, $\lambda_i'$, $\lambda_i''$ are proportional, that is to say when

$$(\lambda_i, \lambda_i', \lambda_i'') = (\lambda, \lambda', \lambda'')\,\kappa_i, \text{ say.} \tag{7·11}$$

For example, the quantities $y_i$, $z_i$ $(i = 1, \ldots, k)$ may have arisen from $k$ different experiments, of fundamentally the same structure, but with different amounts of replication $n_i$. Then the constants $\kappa_i$ are proportional to $1/n_i$. In this case (7·10) and (7·9) become simply

$$(\bar{y} - \mu\bar{z})^2 < \{u(r)\}^2 S^2(\lambda - 2\mu\lambda'' + \mu^2\lambda'), \tag{7·12}$$

where

$$r_i = \frac{1/\kappa_i s_i^2}{\sum_j 1/\kappa_j s_j^2}, \qquad \frac{1}{S^2} = \sum_i \frac{1}{\kappa_i s_i^2}, \tag{7·13}$$

and $\bar{y}$, $\bar{z}$ are weighted means calculated with weights $1/\kappa_i s_i^2$. (The estimates of the variances and covariance of $\bar{y}$, $\bar{z}$ are $(\lambda, \lambda', \lambda'')S^2$.) The $r_i$ are now immediately calculable from the data, without any iteration, and in the case when $\bar{z}$ is significantly different from zero (7·12) gives at once the confidence limits

$$\mu_1, \mu_2 = \frac{m - g\lambda''/\lambda' \pm (uS/\bar{z})[(\lambda - 2m\lambda'' + m^2\lambda') - g(\lambda - \lambda''^2/\lambda')]^{1/2}}{1 - g}, \tag{7·14}$$

where

$$m = \bar{y}/\bar{z}, \qquad g = u^2S^2\lambda'/\bar{z}^2, \tag{7·15}$$

and $u$ is short for $u(r)$.
rs the case where the weights used have been chosen 


8o ‘on reference to the experimental data. This problem also admits à confidence-limit 
isn x but in this case we must use the Welch—Aspin v-tables instead of the w-tables. It 
to in Cessary to choose the same weights for the y'sand the z's, provided that it is possible 
Weights that y=. =); and &= -5 (,. Thus suppose it has been decided to use 

1 (independent of the s?) for the y; and W; for the 2;, with EW,- XW;-1.LetjandZ 




denote the means XI; y; and XW;z;. Then the result is that the confidence limits for p (for 


a significantly non-zero 2) are 


m —gV"|V' + WBV —2mV" +m? V’) -g(V — V^] yy (7-16) 
Has [a= = = 
where m=9/Z, g-wV[ v= v(c), (117) 
and y-XWngs, WHEW PAS, V'-oXWWQMS (7:18) 
ü (W2A, AWWA + jW 2A) S (7-19) 
a ti = SS, 20 WAL + EW AQ st 


: r in (719) 
Equations (7:16) and (7-19) have again to be solved iteratively, with x= %1 0T Jta H1 (3d 
D 


as appropriate. This method seems to have no particular advantage over that using W® 
supplied by the data themselves, except when data from several independent sources a 
being combined, and the A's are not proportional; in this case the simpler calculatior 
might be a determining factor. 


are 


8. EXAMPLES OF THE USE OF THE TABLES 


Example 8·1. Twenty determinations of a quantity using a method of relatively low accuracy gave a mean result of 10·31 with a standard error (19 D.F.) of 1·08. A further ten determinations using a method of higher accuracy gave 11·29 ± 0·67. Assuming that both methods are unbiased, find 95 % confidence limits for the true value.

Since the standard errors already refer to the means, we have

$$w_1 = 1/(\lambda_1 s_1^2) = 1/(1·08)^2 = 0·857,$$
$$w_2 = 1/(\lambda_2 s_2^2) = 1/(0·67)^2 = 2·228.$$

Thus

$$\hat{x} = \frac{(0·857)(10·31) + (2·228)(11·29)}{0·857 + 2·228} = 11·02,$$

and

$$r = \frac{0·857}{0·857 + 2·228} = 0·278.$$

Interpolation in Table 2 gives $u = 2·29$. Thus the required 95 % confidence limits are

$$11·02 \pm 2·29/\sqrt{(0·857 + 2·228)} = 11·02 \pm 1·30.$$
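The arithmetic of Example 8·1 can be retraced as follows. This is a sketch for checking purposes only: the function name `weighted_mean_limits` is hypothetical, and the value $u = 2·29$ is simply taken as quoted from Table 2 instead of being interpolated by the program.

```python
import math


def weighted_mean_limits(means, std_errors, u):
    """Weighted mean, weight ratios r_i and half-width u / sqrt(w).

    The weights are the reciprocals of the squared standard errors of the
    means, i.e. w_i = 1 / (lambda_i * s_i^2).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    w = sum(weights)
    x_hat = sum(wi * xi for wi, xi in zip(weights, means)) / w
    r = [wi / w for wi in weights]
    half_width = u / math.sqrt(w)
    return x_hat, r, half_width


# Data of Example 8.1; u = 2.29 is the tabular value quoted in the text.
x_hat, r, half_width = weighted_mean_limits([10.31, 11.29], [1.08, 0.67], u=2.29)
lower, upper = x_hat - half_width, x_hat + half_width
```

To two decimal places the results agree with the figures in the example: a weighted mean of 11·02, $r = 0·278$ and a half-width of 1·30.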


, say 

Example 8·2. Finney (1952, §9·3) quotes data of Gridgeman (1944), obtained in an assay of vitamin A, using growth rate in male rats as response. Thirty pairs (litter mates) were used, ten pairs for each of three dose levels (0·9, 1·5, 2·5 units of vitamin A in the case of the standard preparation; 0·45, 0·75, 1·25 mg. of the test preparation). One of each pair of rats was assigned to the standard preparation and the other to the corresponding dose of the test preparation. If logarithms of dose (for convenience taken to the base 5/3) are used as dose metameter, and growth in grams over 3 weeks as response metameter, the dosage-response curves become nearly parallel straight lines, and the three dose values are spaced one unit apart. The experiment can be analysed as a typical 'split-plot' design, 'whole plots' being pairs of rats and 'split plots' being individual rats. The mean
difference in response between preparations (test minus standard) is $y = -5·233$ gm.; this is a comparison made within pairs of rats, for which the error mean square in the analysis of variance is 39·35 (27 D.F.). The slope of the dosage-response lines is estimated as $z = 9·350$ gm. per unit log dose; but this is a contrast between pairs of rats, for which the error mean square is 98·60 (27 D.F.). According to the usual assumptions, $y$, $z$ and the two error mean squares are independent. If the true potency of the test preparation is $\rho$ units/mg., then $m = y/z$ is an estimate of $\mu = \log_{5/3} 0·5\rho$ (since the nominal potency of the test preparation, taken into account when fixing the doses to be administered, was 2·0 units/mg.). Thus the potency is estimated as

$$R = 2·0\,(5/3)^{-0·5597} = 1·50 \text{ units/mg.}$$


Now the estimated variances of $y$ and $z$ are $\lambda s^2 = 39·35/15 = 2·624$ and $\lambda' s'^2 = 98·60/40 = 2·465$. Taking $\mu = -0·5597$ as a first approximation, equation (7·5) gives $c = 0·77$. For $P = 0·95$ the first table in Trickett et al. (1956) gives $v = 2·02$. Thus $g = 0·115$, and equation (7·6) gives

$$\mu_1 = -1·062, \qquad \mu_2 = -0·203.$$

Substituting these values back into (7·5) gives $c_1 = 0·486$, $c_2 = 0·963$, or $v_1 = 2·00$, $v_2 = 2·03$. These are only about 1 % different from the original value of $v$, and in practice it would probably not be worth while to carry the calculation further. However, substituting these values in (7·6), using $v_1$ with the minus sign and $v_2$ with the plus, we find the corrected limits

$$\mu_1 = -1·055, \qquad \mu_2 = -0·198.$$

Thus 95 % confidence limits for the potency are 1·17 and 1·81 units/mg.
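The first cycle of the iteration in Example 8·2 can be retraced as below. This is a sketch under stated assumptions: the helper names `welch_aspin_c` and `ratio_limits` are hypothetical, and the Welch-Aspin value $v = 2·02$ is taken as quoted in the text, since no table look-up is attempted here. The sketch evaluates $c$ at $\mu = m$, computes the limits for the quoted $v$, and re-evaluates $c$ at each limit.

```python
import math


def welch_aspin_c(mu, var_y, var_z):
    """Argument c of the Welch-Aspin table: var_y / (var_y + mu^2 * var_z)."""
    return var_y / (var_y + mu * mu * var_z)


def ratio_limits(y, z, var_y, var_z, v):
    """Limits for mu when y and z have independent estimated variances.

    var_y and var_z stand for lambda * s^2 and lambda' * s'^2.
    """
    m = y / z
    g = v * v * var_z / (z * z)
    half = (v / abs(z)) * math.sqrt((1 - g) * var_y + m * m * var_z)
    return (m - half) / (1 - g), (m + half) / (1 - g)


y, z = -5.233, 9.350
var_y, var_z = 39.35 / 15, 98.60 / 40

m = y / z
c0 = welch_aspin_c(m, var_y, var_z)                    # first approximation to c
mu1, mu2 = ratio_limits(y, z, var_y, var_z, v=2.02)    # v as quoted in the text
c1 = welch_aspin_c(mu1, var_y, var_z)                  # c at the lower limit
c2 = welch_aspin_c(mu2, var_y, var_z)                  # c at the upper limit
```

A further cycle would require fresh tabular values of $v$ at `c1` and `c2`, which is the point at which the text turns to the tables again.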
Example 8·3. Finney (1952, §12·5) gives as another example data discussed by Bliss & Rose (1940) relating to an assay of parathyroid extract, the response being the serum calcium level in dogs injected with the standard and test preparations. The design is a fairly complicated one, based on a balanced incomplete block arrangement, and the reader is referred to the above authors for full details. Doses of 0·06 and 0·12 c.c./kg. body weight of each preparation were used, there being eighteen determinations at each of the four doses. For our purposes it will suffice to say that the analysis of variance splits into two independent parts, the inter- and intrablock analyses, each providing an estimate of potency and its own error mean square. The two parts can be treated as separate experiments for further computational purposes. The first section gives the quantity $y_1/z_1$, where $y_1 = 4·500$, $z_1 = 4·833$, these
18 E. the estimated variances and covariance (30 D.F.) Aysi= 126-8/6, Ayst= 126:8/24., 
Aving' | The second section gives the estimate yalz,, where ya — 3:000, ie these 
m te the estimated variances and covariance (30 5.) Ags} = 61°35/12, Ass 61-35/48, 

It will be seen that the $\lambda$'s are proportional, and we may take, for example,

$$(\lambda, \lambda', \lambda'') = (\tfrac{1}{12}, \tfrac{1}{48}, 0), \qquad \kappa_1 = 2, \quad \kappa_2 = 1$$

(see equation (7·11)). The means, using weights $1/(\kappa_1 s_1^2)$, $1/(\kappa_2 s_2^2)$, are

$$\bar{y} = 3·292, \qquad \bar{z} = 6·108.$$

Thus from (7·13) and (7·15)

$$m = 0·5390, \qquad r = 0·195, \qquad S^2 = 49·40.$$

From Table 2, the appropriate value of $u$ for 95 % limits is 2·07, so that $g = 0·118$ and (7·14) gives the limits

$$\mu_1 = -0·151, \qquad \mu_2 = 1·373.$$

The corresponding relative potencies are 0·949 and 1·61.
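The proportional case of Example 8·3 involves no iteration, so it can be retraced directly. This is an illustrative sketch: the function name `pooled_ratio_limits` is hypothetical, `lam`, `lam1`, `lam2` stand for $\lambda$, $\lambda'$, $\lambda''$, and the constants $(1/12, 1/48, 0)$ with $\kappa_1 = 2$, $\kappa_2 = 1$ are assumed to be the ones consistent with the variances 126·8/6, 126·8/24, 61·35/12 and 61·35/48 quoted in the example. The weighted means and $u = 2·07$ are taken as quoted.

```python
import math


def pooled_ratio_limits(y_bar, z_bar, kappa_s2, lam, lam1, lam2, u):
    """Limits for mu when the variance constants of each experiment are proportional.

    kappa_s2 holds the products kappa_i * s_i^2.
    Returns S^2, the ratios r_i, m, g and the two limits.
    """
    S2 = 1.0 / sum(1.0 / k for k in kappa_s2)
    r = [S2 / k for k in kappa_s2]
    m = y_bar / z_bar
    g = u * u * S2 * lam1 / (z_bar * z_bar)
    disc = (lam - 2 * m * lam2 + m * m * lam1) - g * (lam - lam2 * lam2 / lam1)
    half = (u * math.sqrt(S2) / abs(z_bar)) * math.sqrt(disc)
    centre = m - g * lam2 / lam1
    return S2, r, m, g, (centre - half) / (1 - g), (centre + half) / (1 - g)


S2, r, m, g, mu1, mu2 = pooled_ratio_limits(
    y_bar=3.292, z_bar=6.108,
    kappa_s2=[2 * 126.8, 1 * 61.35],
    lam=1 / 12, lam1=1 / 48, lam2=0.0,
    u=2.07,
)
```

To the accuracy quoted, the sketch returns $S^2 = 49·40$, $r_1 = 0·195$, $g = 0·118$ and the limits $-0·151$ and $1·373$.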




; M i tion 
Uptillnow we have neglected a complication in Finney's analysis. This is the ee (No 
of the body weight of the dogs as a concomitant variate in the intrablock analysis. 


à : : is yields 
significant reduction in error variance was obtained in the interblock analysis.) This y. 
the modified values 


Va — 2-511, 2,— 6-270, 5$— 49-74, Aş = 0-083,93, AS=0-020,89, AZ = 0-000,18. 


: F introduction 
The error degrees of freedom in this section of the analysis are now only 29. The pee A 
of the corrections for body weight have caused a slight correlation between yx an te ad 
the A's are no longer quite proportional. However, equations (7:9) and (7-10) pes m 
after some somewhat tedious arithmetic we find the limits 0-943, 1-53 for the 
potency (P — 0-95). 


9. MULTIVARIATE GENERALIZATION OF THE WEIGHTED MEAN PROBLEM 4 
ME 8 


The problem considered in $1 can be generalized as follows. We suppose that X; -- puted 


k p-variate vectors (for example, the means of % p-variate samples) which are pe y 
independently and normally about the same mean-point (centroid) £, but with P mates 
different dispersion matrices a, ..., ay. The latter are unknown, but independent 1 oso ar? 
of them (for example, the dispersion matrices of the k samples) are available; th 


vlan 
à Ce uei m z EI 
supposed to have Wishart distributions with V, «++, Vy degrees of freedom. If Wi=% 


W=W, and $-waA EW;x,, 
then it is known that if the v; are large then (€ — £)' W 


bution with p degrees of freedom. More precisely, making the appropriate substit 
equation (5-28) of James (1954) we find that if we define 


(9:1) 


4 distri- 
(& — E) has approximately à sitions P 


2 Z / 
2b-yeirl 200 1) e wa w, +(X -2% tr (Wa W;y 
à Vi p "o Mo-23) "p 


r 2 2 92) 
*s( x -3%) (tr Wt wy) 
2\p(p+2) p (9:3) 
where (rap f^ tio etit =P, 

0 (9-9 
then Pr[((&—£) W(&—E)« 2h]- P -- O(v-3). with 
This equation can be used to give an ellipsoidal confidence region for the vector e y 
confidence coefficient a 


so’ 
pproximately equal to P, if the degrees of freedom v; are re? 
large. 


I would like to acknowledge the assistance of Dr D. W. J. Cruickshank, Miss D. E. Pilling and Mr J. F. P. Donovan, of the Department of Inorganic and Structural Chemistry, Leeds University, in facilitating the use of the Manchester University automatic digital computer (Mk. I).
REFERENCES

ASPIN, A. A. (1948). An examination and further development of a formula arising in the problem of comparing two mean values. Biometrika, 35, 88-96.
ASPIN, A. A. (1949). Tables for use in comparisons whose accuracy involves two variances, separately estimated. Biometrika, 36, 290-6.
BEHRENS, W. V. (1929). Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen. Landw. Jb. 68, 807-37.
BLISS, C. I. & ROSE, C. L. (1940). The assay of parathyroid extract from the serum calcium of dogs. Amer. J. Hyg. A, 31, 79-98.
CREASY, M. A. (1954). Limits for the ratio of means. J. R. Statist. Soc. B, 16, 186-94.
FIELLER, E. C. (1940). The biological standardization of insulin. J. R. Statist. Soc. Suppl. 7, 1-64.
FIELLER, E. C. (1954). Some problems in interval estimation. J. R. Statist. Soc. B, 16, 175-85.
FINNEY, D. J. (1950). Two new uses of the Behrens-Fisher distribution. J. R. Statist. Soc. B, 12, 293-300.
FINNEY, D. J. (1952). Statistical Method in Biological Assay. London: Griffin.
FISHER, R. A. (1935). The fiducial argument in statistical inference. Ann. Eugen., Lond., 6, 391-8.
FISHER, R. A. (1939). The comparison of samples with possibly unequal variances. Ann. Eugen., Lond., 9, 174-80.
FISHER, R. A. (1941). The asymptotic approach to Behrens's integral, with further tables for the d test of significance. Ann. Eugen., Lond., 11, 141-72.
FISHER, R. A. & YATES, F. (1953). Statistical Tables for Biological, Agricultural and Medical Research, 4th ed. Edinburgh: Oliver and Boyd.
GRIDGEMAN, N. T. (1944). The Estimation of Vitamin A. London: Lever Brothers and Unilever Ltd.
JAMES, G. S. (1954). Tests of linear hypotheses in univariate and multivariate analysis when the ratios of the population variances are unknown. Biometrika, 41, 19-43.
JEFFREYS, H. (1940). Note on the Behrens-Fisher formula. Ann. Eugen., Lond., 10, 48-51.
MAULDON, J. G. (1955). Pivotal quantities for Wishart's and related distributions, and a paradox in fiducial theory. J. R. Statist. Soc. B, 17, 79-85.
PEARSON, E. S. & HARTLEY, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge: The University Press for the Biometrika Trustees.
SUKHATME, P. V. (1938). On Fisher and Behrens' test of significance for the difference in means of two normal samples. Sankhyā, 4, 39-48.
TRICKETT, W. H. & WELCH, B. L. (1954). On the comparison of two means: further discussion of iterative methods for calculating tables. Biometrika, 41, 361-74.
TRICKETT, W. H., WELCH, B. L. & JAMES, G. S. (1956). Further critical values for the two-means problem. Biometrika, 43, 203-5.
WELCH, B. L. (1947a). The generalization of 'Student's' problem when several different population variances are involved. Biometrika, 34, 28-35.
WELCH, B. L. (1947b). On the Studentization of several variances. Ann. Math. Statist. 18, 118-22.
YATES, F. (1939). An apparent inconsistency arising from tests of significance based on fiducial distributions of unknown parameters. Proc. Camb. Phil. Soc. 35, 579-91.




Table 1. Upper 5 % critical values of $u = (\hat{x} - \mu)\sqrt{(w_1 + w_2)}$
(i.e. upper 10 % critical values of $|u|$)


| wy | 
r= AM 


- 0-0 0-1 0:2 0:3 0-4 0-5 0:6 0-7 
| | Wy + We 
| 
Y. | n | 
| 6 6 1:94 205 211 214 215 215 215 214 
| 194 201 205 207 208 207 207; 2-06 
10 1:94 1:99 202 2:03 2:04 203 2-02 2-01 
| 15 194 197 198 199 1:99 198 197  L95 
20 | 1:94 1:96 1:07 1-97 1-96 1-95 1-94 1-92 
| oo 1:94 1:94 1-93 1-91 1-90 1:88 1:86 1:83 
8 6 | 1:86 1:96 2-03 2-06 2-07 2-07 2-08 2-07 
8 186 193 197 1:99 2:00 200 2-00 1-99 
10 | 186 1:91 194 196 1:96 196 196 1:95 
| 15 | 1:86 1:89 1:90 191 191 191 190 1-89 
20 | 1-86 1:88 189 189 189 189 188 1-86 
co 186 1:86 1:85 1-84 183 1:82 1:80 1-78 
10 6 | 181 1:91 198 201 2:02 2:03 2:04 2-03 
8 | E81 188 192 1:95 1:96 196 196 1-96 
10 181 186 1:89 191 192 1:92 1-92 1-91 
| 15 1-81 1:84 1:86 1:87 1:87 1:87 1:87 1:85 
| | 20 181 1-83 1:84 1-85 1-85 1:85 1-84 1-83 
| oo L81 181 181 180 179 178 176 175 
| 
15 6 A 185 191 195 197 198 199 1:99 
8 | L82 1:86 189 1:90 191 191 1-91 
10 | L80 183 1-85 187 187 187 1:87 
15 | 178 1:80 1-81 1:82 182 1-82 1-81 
| 20 L77 179 179 180 180 179 179 
oo 1-75 1-75 1-75 1-74 1-73 1-72 1-71 
20 6 L72 182 1:88 1:92 194 195 196 1-97 
8 1-72 1:79 1-83 1:86 1-88 1-89 1:89 1:89 
10 172 177 180 1:83 184 185  L85 1-85 
15 172 175 177 1:79 179 1-80 180 179 
20 | 2 Ta L0. a D Pa wm y 
eo 172 172 172 1572 171 171 15470 169 
co 6 164 1-74 1:80 1:83 1:86 1-88 190 1-91 
8 164 171 1:75 178 180 1:82 1-83 1-84 
10 164 1-69 1-72 175 1-76 1-78 1:79 1:80 
15 1-64 167 1:69. 1-71 1:72 173 174 1-76 
20 164 166 168 1:69 170 1-71 171 1-72 
oo 1:64 1-64 1:64 1-64 1-64 1:64 1:64 1:64 


2-11 
2-03 
1:98 
1-91 
1-88 
1-80 


ww 


2-05 
1:97 
1-92 
1:86 
1:83 
1475 


2-02 
1-94. 
1:89 
1:83 
1:80 
1:72 


1:98 
1:90 
1:86 
1:80 
1:77 
1:69 


1:97 
1:89 
1-84 
1-79 
1-76 
1:68 


1:93 
1:85 
1:81 
1:75 
1:72 
1-64 


$\hat{x} = (w_1x_1 + w_2x_2)/(w_1 + w_2)$ is the weighted mean of two independent normally distributed variables $x_1$ and $x_2$, which have the same expected value $\mu$, and variances $\lambda_1\sigma_1^2$ and $\lambda_2\sigma_2^2$ respectively. Independently distributed estimates $s_1^2$, $s_2^2$ of $\sigma_1^2$, $\sigma_2^2$ based on $\nu_1$, $\nu_2$ degrees of freedom are available, and the weights used are $w_1 = 1/(\lambda_1 s_1^2)$, $w_2 = 1/(\lambda_2 s_2^2)$.

For example, if the $x_i$ and $s_i^2$ $(i = 1, 2)$ denote the means and variances of two samples of sizes $n_i$ taken from two normal populations with the same mean $\mu$, then $\nu_i = n_i - 1$ and $\lambda_i = 1/n_i$.




Table 2. Upper 2½ % critical values of $u = (\hat{x} - \mu)\sqrt{(w_1 + w_2)}$
(i.e. upper 5 % critical values of $|u|$)


wi . | 
ETA 00 01 02 O03 04 O05 206 OF 08 209 L0 | 
Vs » 
$ 8 231 24; 247 247 245 243 
10 | 231 2-41 240 2.39 235 
12 231 23; 237 236 234 231 
ls | su 234 233 232 230 236 
20 231 231 230 228 226 222 
se 231 2.22 217 214 2-10 
"d 8 | $239 230 235 239 240 241 24l 240 238 235 
10 2.93 227 231 233 235 235 235 233 231 227 
12 293 226 239 230 231 31 230 229 227 223 
15 223 225 227 228 228 227 227 225 222 218 
20 223 224 225 225 225 224 223 231 218 214 
œo 223 221 220 218 216 2l 212 209 206 202 
F 8 2-18 231 234 230 237 237 237 2360 234 231 
10 | 218 227 239 230 ] 231 230 229 220 223 
12 | 248 994 226 227 227 227 2.96 224 221 218 
15 | 918 2.99 223 224 224 223 222 220 237 213 
20 2-18 220 221 221 220 219 218 216 213 209 
© | 218 aig ait 209 Su 2.00 meats 2.04 200 1-96 
H 8 213 220 226 230 232 233 234 234 2.31 
10 213 218 222 225 227 227 228 228 2.23 
12 213 3217 220 222 223 224 234 223 2.18 
15 213 216 218 219 220 2:20 220 219 2-13 
20 | 213 215 216 217 217 217 216 215 2-09 
c | 9143 9212 211 210 209 208 2:06 2-04 1-96 
E 8 209 216 222 226 228 230 2 231 232 231 231 
10 | 209 214 218 221 223 234 2 295 225 224 2-23 
12 | 209 213 216 218 219 220 2 221 220 219 218 
15 209 211 214 216 216 217 2 217 216 215 213 
20 209 210 212 213 213 214 2 213 212 210 2:09 
Se 2.99 208 207 207 206 205 2 202 2:00 198 1-96 
ie] 2 3 
s 2 : old 217 220 222 234 227 229 231 
10 oe oe au sm sie Sak $015, 9000 221 223 
13 1-96 2-00 9.04 2:07 209 211 213 214 2-16 2:17 218 
15 1:96 199 2:02 9.04 206 2-08 2-09 2-10 2n 212 2:13 
20 196 198 200 2:02 2.00 205 2-06 207 207 2.08 2-09 
eo 1:96 1:96 1:96 1-96 196 1:96 1:96 196 1:96 1:96 1-96 
$\hat{x} = (w_1x_1 + w_2x_2)/(w_1 + w_2)$ is the weighted mean of two independent normally distributed variables $x_1$ and $x_2$, which have the same expected value $\mu$, and variances $\lambda_1\sigma_1^2$ and $\lambda_2\sigma_2^2$ respectively. Independently distributed estimates $s_1^2$, $s_2^2$ of $\sigma_1^2$, $\sigma_2^2$ based on $\nu_1$, $\nu_2$ degrees of freedom are available, and the weights used are $w_1 = 1/(\lambda_1 s_1^2)$, $w_2 = 1/(\lambda_2 s_2^2)$.

For example, if the $x_i$ and $s_i^2$ $(i = 1, 2)$ denote the means and variances of two samples of sizes $n_i$ taken from two normal populations with the same mean $\mu$, then $\nu_i = n_i - 1$ and $\lambda_i = 1/n_i$.


320 Accuracy of weighted means and ratios 


Table 3. Upper 1 % critical values of $u = (\hat{x} - \mu)\sqrt{w_1 + w_2}$
(i.e. upper 2 % critical values of $|u|$)


$\nu_2$  $\nu_1$ | $w_1/(w_1 + w_2)$ = 0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0
10 10 2-76 280 2-84 2-87 289 2-89 2-89 287 2:84 2-80 
12 276 279 281 2:83 283 283 282 2.80 277 273 
15 276 277 278 278 278 278 276 274 270 2-65 
20 276 2-76 2-75 275 274 273 271 2.608 264 258 
30 2-76 2-75 273 271 270 268 265 262 257 252 
co 276 273 269 2-66 2-63 2.59 255 2.51 2:46 2:40 
12 10 268 2-73 277 2-80 2-82 283 283 2.83 281 2-79 
12 268 271 274 276 277 277 27; 276 274 27l 
15 268 2-00 271 272 272 272 271 270 267 264 
20 2-68 268 2-08 2-08 2-68 267 266 264 2-61 2°57 
30 2-08 267 2-66 2-05 2-64 2.682 260 2.58 2-54 2-50 
co 268 265 2-62 260 2.57 254 2-51 247 243 2:38 
15 10 | 260 265 270 274 276 278 9.78 2-78 278 277 
12 2.00 2-64 267 270 271 272 272 272 271 269 
15 2-00 262 264 266 266 267 266 266 264 262 
20 2-00 261 262 262 262 262 2-001 260 2:58 2:55 46 
30 2-00 260 2-59 259 258 2.57 2-56 254 2.52 249 233 
co 900 2.58 256 254 251 9.49 2.46 244 240 237 ? 
20 10 2:53 2-58 264 268 271 273 274 275 275 276 ui 
12 253 2-97 261 2-64 266 267 2.68 268 2.08 268 Sed 
15 2:53 255 2-58 260 261 2.02 262 262 262 261 -53 
20 2:53 254 255 256 257 2-57 257 256 255 254 246 
30 2:53 2-53 2.53 2-53 253 253 2.52 2.51 249 2-48 2:33 
co 253 2-51 2-50 248 246 244 243 240 238 235 
30 10 2-46 2-52 2.57 262 2.65 2.68 230 271 273 278 Hi 
12 246 2-50 254 2-58 260 2-62 264 265 266 267 260 
15 2:46 249 252 254 256 257 258 259 259 2-60 2-53 
20 246 248 249 251 252 253 253 253 253 253 5g 
30 246 246 247 248 248 248 248 248 247 246 535 
co 946 245 244 243 241 240 239 238 236 234 
.16 
oo) 10 2.33. 240 246 251 255 259 263 266 269 273 2.08 
12 2:33 238 243 247 95] 254 257 260 262 2:05 2.60 
15 233. 2-37 240 244 246 249 25 254 2:56 258 555 
20 233. 235 238 240 243 244 246 248 250 251 546 
30 233 24 236 238 239 240 241 243 244 245 2.33 
oo 2.33. 233 2.33 2.33 2.33 2.33 2:33 2.33 233 233 ^ 


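$\hat{x} = (w_1 x_1 + w_2 x_2)/(w_1 + w_2)$ is the weighted mean of two independent normally distributed variables $x_1$ and $x_2$ which have the same expected value $\mu$, and variances $\lambda_1\sigma_1^2$ and $\lambda_2\sigma_2^2$ respectively. Independently distributed estimates $s_1^2$, $s_2^2$ of $\sigma_1^2$, $\sigma_2^2$ based on $\nu_1$, $\nu_2$ degrees of freedom are available, and the weights used are $w_1 = 1/(\lambda_1 s_1^2)$, $w_2 = 1/(\lambda_2 s_2^2)$.
For example, if the $x_i$ and $s_i^2$ ($i = 1, 2$) denote the means and variances of two samples of sizes $n_i$ taken from two normal populations with the same mean $\mu$, then $\nu_i = n_i - 1$ and $\lambda_i = 1/n_i$.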


G. S. JAMES 321 


Table 4. Upper ½ % critical values of $u = (\hat{x} - \mu)\sqrt{w_1 + w_2}$
(i.e. upper 1 % critical values of $|u|$)


$\nu_2$  $\nu_1$ | $w_1/(w_1 + w_2)$ = 0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0
10 10 317 320 334 327 329 330 3:29 3:27 324 3:20 317 
12 317 348 320 321 322 3:22 321 318 314 309 305 
15 317 316 316 316 336 335 313 310 305 3:00 2-95 
20 317 314 313 311 3:10 3:08 3:05 3:02 2-96 290 2-85 
30 3:17 313 3:10 307 305 3:02 2-99 2-94 288 282 2-75 
co 317 311 3:06 3:01 2-96 2-91 286 280 274 266 2-58 
E 10 $05 309 314 32318 321 322 322 321 320 318 317 
12 $05 307 310 312 314 314 314 312 310 307 305 
15 305 306 3:07 3:07 308 307 306 3-04 301 2-98 2-95 
20 3-05 3:04 303 3:03 302 3-01 299 296 2-93 2-88 2-85 
30 305 303 301 299 297 295 292 289 285 2-80 275 
9o 305 3:01 2:97 293 289 284 280 275 270 264 258 
w 10 295 300 3:05 310 313 315 316 316 316 zi EH 
12 295 298 3:01 3-04 3:06 3:07 308 3:07 3-07 rM po 
15 295 296 298 299 300 300 300 299 298 E 
20 2-05 294 2:95 2:95 2:95 294 293 2-91 2-89 aa jos 
30 295 293 292 291 290 288 286 284 281 2 i 2m 
9o 2:95 291 288 285 281 278 275 271 267 26 
" 10 2:85 290 296 3:02 3-05 308 310 3-11 Pn d Pn 
12 285 288 293 296 299 301 302 303 Dd eee at 
15 285 287 289 201 293 294 295 295 205 2 Pus 
20 285 285 286 287 288 288 288 287 286 285 285 
285 284 284 283 283 282 281 280 278 277 275 
30 285 284 284 283 2 8 p rm oan dn 
9o 2:85 282 280 277 2575 272 2570 267 264 2 2 
is 10 275 2:82 2:88 294 299 3:02 3:05 3:07 3-10 B a 
12 275 280 285 289 202 295 297 299 3-01 E He 
15 275 278 281 284 2-86 288 2-90 ai ea A 
- xu pgr ose B om pe 2^ 276 276 275 275 
30 or bye mme wa ae SOT BIE 2 ess 
5m) 275 273 272 210 69 267 265 2 
% Š 01 306 311 317 | 
10 258 266 274 2-80 286 pet as n E nm m 
12 2-58 264 270 275 280 3M Mu dis Amp GNI 296 
n 258 202 207 271 2795 2^ 275 277 280 282 285 
a 208: Bal Fo Ti ZTO 267 269 270 272 273 215 
ed 258 260 262 2 a ^ 2. " 8 258 25 
| oo 2.58 258 258 258 258 2-58 2:58 2-58 2-58 5 8 
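$\hat{x} = (w_1 x_1 + w_2 x_2)/(w_1 + w_2)$ is the weighted mean of two independent normally distributed variables $x_1$ and $x_2$ which have the same expected value $\mu$, and variances $\lambda_1\sigma_1^2$ and $\lambda_2\sigma_2^2$ respectively. Independently distributed estimates $s_1^2$, $s_2^2$ of $\sigma_1^2$, $\sigma_2^2$ based on $\nu_1$, $\nu_2$ degrees of freedom are available, and the weights used are $w_1 = 1/(\lambda_1 s_1^2)$, $w_2 = 1/(\lambda_2 s_2^2)$.
For example, if the $x_i$ and $s_i^2$ ($i = 1, 2$) denote the means and variances of two samples of sizes $n_i$ taken from two normal populations with the same mean $\mu$, then $\nu_i = n_i - 1$ and $\lambda_i = 1/n_i$.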




ON ESTIMATING THE LATENT AND INFECTIOUS 
PERIODS OF MEASLES 


II. FAMILIES WITH THREE OR MORE SUSCEPTIBLES


By NORMAN T. J. BAILEY 
Design and Analysis of Scientific Experiment, 6 Keble Road, Oxford 


1. INTRODUCTION 
In a previous paper (Bailey (1956), referred to here as Part I) I discussed, for families with only two susceptibles, the maximum-likelihood estimation of parameters in an epidemic model involving a normally distributed latent period after the receipt of infection, followed by a constant infectious period terminating with the appearance of symptoms and removal of the patient from circulation. The present paper gives the extensions required for dealing with families having more than two susceptibles. The case of three susceptibles, which is quite common, is described in some detail as it presents a few new features. Larger families can be analysed in a similar way, but the procedure becomes rapidly more complicated, especially as the misclassification of links in the chain develops into an item of major importance. Since there is at present little data on such families we shall give an indication only of how to take account of all the complications involved.

An illustrative example is provided here for families with three susceptibles using the excellent material that has very kindly been made available to me by Dr R. E. Hope Simpson. The data are based on families residing in the Cirencester area during [...] with three susceptible children under 15 years of age including at least one case of measles.


2. DESCRIPTION OF DATA FOR FAMILIES WITH THREE SUSCEPTIBLES 


As shown in Part I, for families with only two susceptibles, the data, consisting of a total of N families with at least one case, fall naturally into three parts. There are A families with two cases, both having been infected simultaneously by an outside contact, and a further B families with two cases, where the second case is derived from the first by cross-infection within the family. We also know the time interval between the two cases: w for the first type and z for the second. As a first approximation we assume that these two types, which are labelled (2) and (1²) in chain-binomial notation (Bailey, 1955), can be properly distinguished. The third type of family, of which there are C, so that N = A + B + C, contains only a single case and is labelled (1).

When dealing with families having three susceptibles the parts of the data involving one or two cases only can again be described as above, and we shall use the same notation. But we also have in addition D families containing three cases, so that now N = A + B + C + D. These families are of four kinds represented by the chain-binomial symbols (1³), (12), (3) and (21), with actual numbers E, F, G and H, respectively, where D = E + F + G + H. This time there are two time intervals to be recorded: u between the first and second cases, and v between the second and third cases. Assuming for the time being that the different types of chain can be correctly identified, we must consider the type of information they provide in some detail.


Consider first the E families of type (1³), where the second case is derived from the first, and the third from the second, both by cross-infection within the family. The two intervals, u and v, are each a z-type variable such as arises from the B families giving a (1²) chain. Similarly, the F families of type (12), where the second and third cases have been simultaneously infected by the first, also provide two z-type observations, but these are now given by u and u + v. We therefore have in all J = B + 2(E + F) z-type observations which can be subjected to the analysis already presented in Part I.

Next we consider the G families of type (3), where all three cases have been simultaneously infected by an outside contact. These must be examined separately as they cannot be put in terms of the types of data already discussed. The appropriate analysis is given in the next section.

Lastly, the H families of type (21), where the first two cases are derived from a simultaneous infection and the third by cross-infection, present a special difficulty. The first interval u is a w-type variable and can be taken in conjunction with the data for the A families of type (2), giving K = A + H families altogether. The second interval v has a very much more complicated distribution, since the third case could have been infected during either of the infectious periods of the first two cases. It seems better to ignore the small amount of information available from this source in order to avoid excessive complexity.


Table 1. Distribution of time intervals for fifteen families of three susceptibles with two cases

Time interval in days   | [...]
No. of families         | [...]
Probable type of chain  | (2)    | (1²)
Total no.               | A = 4  | B = 11
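As a check on the bookkeeping just described, the pooled totals can be assembled mechanically. The following is only a sketch: the counts are the observed numbers reported in Tables 2a and 5 of this paper, while the dictionary layout and the function name `derived_totals` are purely illustrative.

```python
# Observed numbers of families by chain type, as read from Tables 2a and 5.
# Keys follow the paper's notation; the data structure itself is an assumption.
counts = {"A": 4, "B": 11, "C": 6, "E": 6, "F": 37, "G": 7, "H": 7}


def derived_totals(c):
    """Return the totals D, N, J and K defined in the text."""
    D = c["E"] + c["F"] + c["G"] + c["H"]   # families with three cases
    N = c["A"] + c["B"] + c["C"] + D        # all families with at least one case
    J = c["B"] + 2 * (c["E"] + c["F"])      # number of z-type observations
    K = c["A"] + c["H"]                     # number of w-type observations
    return {"D": D, "N": N, "J": J, "K": K}


totals = derived_totals(counts)
print(totals)
```

The values of J and K obtained in this way agree with the totals shown in Tables 4 and 3 respectively.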


Finally, we can analyse the data according to the numbers of families showing different types of chain, apart from the time intervals involved. The main subdivision here depends on whether the disease is introduced by a single case or by two simultaneously infected cases. A triple introduction adds nothing further to what has already been mentioned in the last paragraph but one. When there are two initial cases, i.e. A of type (2) or H of type (21), the treatment is very similar to that given in Part I for the distribution of B, given [...]. When there is a single introduction we have types (1), (1²), (1³) and (12), with observed numbers C, B, E and F, where we write M = B + C + E + F.

A further point arises here in connexion with the possibility of variations in the chance of infection. As shown in Part I the z-type distribution is relatively insensitive to such variations, while the w-type variables are quite unaffected. The mean frequencies of A and H are likewise unchanged. Variation in the chance of infection is therefore ignored so far as these sections of the data are concerned. However, it has been shown (Bailey, 1953) that the distribution of chains in families of three susceptibles with a single initial case is sensitive to variations in the chance of infection. An additional parameter must therefore be introduced when this part of the data is analysed.

The last matter to be mentioned in this section is the question of correctly identifying the links of the chains involved. The distinction between (1²) and (2) is accomplished in a similar manner to that described in Part I. Table 1 shows the distribution of fifteen observations of this sort with an approximate dichotomy being made, as before, between 4 and 5 days. It is a little more difficult to separate the four groups of families with three cases and we now need to inspect a two-way table. The distribution of fifty-seven observations is shown in


Table 2. Distribution of time intervals (u, v) for fifty-seven families 
of three susceptibles with three cases 


Time (u) between first and second cases in days 
Total 
0123 4 5 6 7 8 91011 12 13 14 15 16 
=! 
p E 
a | 9 TX é 1 Lt 14 
E Ti Es i 3 31 1 12 | 
z 2 1 à 1 1 ae 8 2 12 
E 3 1 ilb 1 4 
= 4 : 2 2 
Ez] 
E wen 
E 5 1-2 È 
td 6 1 1 
g | 7 j 
8 8 Ul l 
E 9| 1 i 2 
g 10 1 1 | 
$ 1 1 1 | 
= | a} a i : 2 
T 13 1 1 
E 14 1 1 
B 
Total 6 431 333961022311 | 9-P 
Table 2a. Summary of Table 2 giving probable types of chain with observed numbers, based on a dividing line between 4 and 5 days

(3)    G = 7   |  (12)   F = 37
(21)   H = 7   |  (1³)   E = 6


Tables 2 and 2a. Inspection of the marginal totals suggests that an appropriate division is again between 4 and 5 days, but there is now a somewhat greater possibility of overlap than appeared in Part I. As mentioned in § 5 below the chance of misclassification can be allowed for by the introduction of additional parameters, but this seems hardly worth carrying out in the present investigation. Tables 3 and 4 give the amalgamated expected and observed numbers for those parts of the data giving rise to w- and z-type distributions, while the analysis of the chains for families with two and three cases is presented in Table 5.


Table 3. Amalgamated frequencies for w-distribution, with a total of K = A + H observations

Time interval in days | Observed no. | Expected no.
0                     | 3            | 1.87
1                     | 4            | 3.42
2                     | 1            | 2.61
3                     | 2            | 1.64
4                     | 1            | 0.88
5 and over            | -            | 0.58
Total                 | 11 = K       | 11.00


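Table 4. Amalgamated frequencies for z-distribution, with a total of J = B + 2(E + F) observations

Time interval in days | Observed no. | Expected no.
0                     | -            | -
1                     | -            | -
2                     | -            | 0.01
3                     | -            | 0.04
4                     | -            | 0.24
5                     | 3            | 1.01
6                     | 5            | 3.02
7                     | 5            | 6.64
8                     | 6            | 10.97
9                     | 17           | 14.16
10                    | ...          | 15.00
11                    | ...          | 13.80
12                    | 10           | 11.55
13                    | 10           | 8.87
14                    | 5            | 6.03
15                    | 1            | 3.43
16 and over           | 4            | 2.23
Total                 | 97 = J       | 97.00

Classes bracketed together: 0 to 6 days (observed 8, expected 4.32); 10 and 11 days (observed 31, expected 28.80); 15 days and over (observed 5, expected 5.66).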


Table 5. Analysis of chains for two or three cases

Type of introduction | Type of chain | Observed no. | Expected no.
Single               | (1)           | 6            | 8.62
                     | (1²)          | 11           | 6.23
                     | (1³)          | 6            | 9.81
                     | (12)          | 37           | 35.34
                     | Total         | 60 = M       | 60.00
Double               | (2)           | 4            | 3.05
                     | (21)          | 7            | 7.95
                     | Total         | 11 = K       | 11.00


3. DERIVATION OF SCORES AND INFORMATION FUNCTIONS


For a detailed discussion of the mathematical model being used reference should be made to Part I. It is sufficient to repeat here that we assume the latent period to be normally distributed with mean m and variance $\sigma^2$, and the ensuing infectious period to be of constant length a. Infection of further susceptibles during this time is taken to be a Poisson process such that the chance of a given susceptible contracting the disease in time dt is $\lambda\,dt$. The main aspects of the data that are fairly easily analysed will now be described in detail.

The w- and z-type distributions arising from the data are easily dealt with according to the theory already described in Part I. The only difference is that there all contributions to scores and information functions were amalgamated (see (9) and (11) of Part I), whereas here it is more convenient to have the individual contributions displayed separately.


w-type distribution


There are K families giving w-type observations, and the appropriate score and information functions are
$$S_\sigma = K\sigma^{-1}\left(\tfrac{1}{2}V\sigma^{-2} - 1\right), \tag{1}$$
$$I_{\sigma\sigma} = 2K\sigma^{-2}, \tag{2}$$
where, as before, V is the observed second moment of the distribution about the origin.
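Formulae (1) and (2) are simple enough to evaluate directly. The sketch below does so for a list of w-type intervals; the function name and the sample intervals are hypothetical and serve only to show that the score vanishes when $\sigma^2 = \tfrac{1}{2}V$.

```python
import math


def w_score_and_information(w_values, sigma):
    """Score (1) and information (2) for sigma from w-type intervals."""
    K = len(w_values)
    V = sum(w * w for w in w_values) / K            # second moment about the origin
    score = K / sigma * (0.5 * V / sigma**2 - 1.0)  # equation (1)
    information = 2.0 * K / sigma**2                # equation (2)
    return score, information


# Hypothetical intervals in days (not the Cirencester data).
w_obs = [0, 0, 1, 1, 2, 3, 4]
V_obs = sum(w * w for w in w_obs) / len(w_obs)
sigma_hat = math.sqrt(V_obs / 2.0)   # root of the score equation
score_at_hat, info_at_hat = w_score_and_information(w_obs, sigma_hat)
print(sigma_hat, score_at_hat, info_at_hat)
```

In the full analysis this contribution is not solved on its own but is added to the contributions from the other parts of the data.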
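z-type distribution

Here we have J families in all yielding z-type observations. The four scores are
$$\begin{aligned}
S_\lambda &= J\{m - \bar{z} + \lambda^{-1} + \lambda\sigma^2 - a(e^{\lambda a} - 1)^{-1}\} + \sum T_\lambda, \\
S_a &= -J\lambda(e^{\lambda a} - 1)^{-1} + \sum T_a, \\
S_m &= J\lambda + \sum T_m, \\
S_\sigma &= J\lambda^2\sigma + \sum T_\sigma,
\end{aligned} \tag{3}$$
where $\bar{z}$ and $v$ are the mean and variance of the observed distribution, and $T$ is defined as in Part I and summed over the observed values of z. The corresponding information functions are
$$\begin{aligned}
I_{\lambda\lambda} &= \sum T_\lambda^2 - J\{\lambda^2\sigma^4 + \sigma^2 - \lambda^{-2} + a^2 e^{\lambda a}(e^{\lambda a} - 1)^{-2}\}, \\
I_{\lambda a} &= \sum T_\lambda T_a + J(1 + \lambda^2\sigma^2)(e^{\lambda a} - 1)^{-1} - J\lambda a e^{\lambda a}(e^{\lambda a} - 1)^{-2}, \\
I_{\lambda m} &= \sum T_\lambda T_m - J(1 + \lambda^2\sigma^2), \\
I_{\lambda\sigma} &= \sum T_\lambda T_\sigma - J\lambda^3\sigma^3, \\
I_{aa} &= \sum T_a^2 - J\lambda^2(e^{\lambda a} - 1)^{-2}, \\
I_{am} &= \sum T_a T_m + J\lambda^2(e^{\lambda a} - 1)^{-1}, \\
I_{a\sigma} &= \sum T_a T_\sigma + J\lambda^3\sigma(e^{\lambda a} - 1)^{-1}, \\
I_{mm} &= \sum T_m^2 - J\lambda^2, \\
I_{m\sigma} &= \sum T_m T_\sigma - J\lambda^3\sigma, \\
I_{\sigma\sigma} &= \sum T_\sigma^2 - J\lambda^4\sigma^2.
\end{aligned} \tag{4}$$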
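The explicit terms in (3) can be traced to the density of z implied by the model. The sketch below is a derivation from the stated assumptions rather than a formula quoted from Part I: it takes z to be the sum of a normal latent period and an infection time with density proportional to $e^{-\lambda t}$ on (0, a), and completes the square. The closed form, the function names and the trial parameter values are all assumptions made for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def z_density(z, lam, a, m, sigma):
    """Density of z = t + latent period, with t truncated-exponential on (0, a)
    and the latent period N(m, sigma^2); obtained by completing the square."""
    c = lam / (1.0 - math.exp(-lam * a))          # normalising factor for t
    shift = z - m - lam * sigma**2
    bracket = normal_cdf((a - shift) / sigma) - normal_cdf(-shift / sigma)
    return c * math.exp(-lam * (z - m) + 0.5 * (lam * sigma) ** 2) * bracket


# Trial values close to the estimates reported later in the paper.
lam, a, m, sigma = 0.18, 7.05, 7.63, 1.59
print(z_density(10.0, lam, a, m, sigma))
```

Differentiating the logarithm of the factor outside the bracket reproduces the terms multiplied by J in (3), which suggests that the T's correspond to derivatives of the logarithm of the bracketed difference; that identification is an inference, not a statement from the source.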
Chains with double introductions

Take the K families with a double introduction, i.e. A of type (2) and H of type (21). Using the Greenwood rather than the Reed-Frost formulation (see Bailey, 1955, §§ 6 and 7), the relative frequencies are $e^{-\lambda a}$ and $1 - e^{-\lambda a}$. Contributions to the scores and information functions are therefore
$$S_\lambda = Ha(e^{\lambda a} - 1)^{-1} - Aa, \qquad S_a = \lambda a^{-1} S_\lambda, \tag{5}$$
$$I_{\lambda\lambda} = Ka^2(e^{\lambda a} - 1)^{-1}, \qquad I_{\lambda a} = \lambda a^{-1} I_{\lambda\lambda}, \qquad I_{aa} = \lambda^2 a^{-2} I_{\lambda\lambda}. \tag{6}$$
There is no special difficulty in developing corresponding formulae for the Reed-Frost variant if required, though variation in the chance of infection must then be allowed for.

Chains with single introductions

As mentioned above, it is necessary to take into account the possibility of variations in the chance of infection when analysing the frequencies of the several chains starting with only a single case. This has already been discussed at length in an earlier paper (Bailey, 1953). The chief assumption made was that the chance of cross-infection, p, which when constant equals $1 - e^{-\lambda a}$ in our present notation, followed a $\beta$-distribution with parameters x and y, namely
$$dF = \frac{1}{B(x, y)}\, p^{x-1}(1 - p)^{y-1}\, dp \qquad (0 \le p \le 1). \tag{7}$$


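The scores and observed information functions, repeated here for convenience using the present symbols for observed quantities, were shown to be
$$\begin{aligned}
S'_x &= \frac{B+E+F}{x} + \frac{E+F}{x+1} - \left\{\frac{M}{x+y} + \frac{M}{x+y+1} + \frac{B+E}{x+y+2}\right\}, \\
S'_y &= \frac{B+C+E}{y} + \frac{B+C}{y+1} - \left\{\frac{M}{x+y} + \frac{M}{x+y+1} + \frac{B+E}{x+y+2}\right\},
\end{aligned} \tag{8}$$
and
$$\begin{aligned}
I'_{xx} &= \frac{B+E+F}{x^2} + \frac{E+F}{(x+1)^2} - W, \\
I'_{xy} &= -W, \\
I'_{yy} &= \frac{B+C+E}{y^2} + \frac{B+C}{(y+1)^2} - W,
\end{aligned} \tag{9}$$
where
$$W = \frac{M}{(x+y)^2} + \frac{M}{(x+y+1)^2} + \frac{B+E}{(x+y+2)^2}.$$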


Primes are used here to distinguish these auxiliary scores and information functions from those with which we are more immediately concerned.

This procedure entails using two parameters to specify the probability of infection. The parameter $\lambda$, which we are already using to score data that are relatively insensitive to variations in the chance of infection, can be regarded as specifying an average probability. We therefore need to introduce one additional parameter. The average value of p for the distribution (7) is $\bar{p} = x/(x + y)$, so that we can write
$$\frac{x}{x + y} = 1 - e^{-\lambda a}, \qquad \text{or} \qquad x = y(e^{\lambda a} - 1). \tag{10}$$


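We can now suppose the expected frequencies to have been written in terms of $\lambda$, a and y instead of x and y. The usual processes of differentiation, using the functional relationship (10), then lead to the following expressions for the scores and information functions as required in the present context, written in terms of those given in (8) and (9):
$$S_\lambda = a(x + y)S'_x, \qquad S_a = \lambda(x + y)S'_x, \qquad S_y = (x/y)S'_x + S'_y; \tag{11}$$
$$\begin{aligned}
I_{\lambda\lambda} &= a^2(x + y)\{(x + y)I'_{xx} - S'_x\}, \\
I_{\lambda a} &= \lambda a^{-1} I_{\lambda\lambda} - (x + y)S'_x, \\
I_{\lambda y} &= a(x + y)y^{-1}(xI'_{xx} + yI'_{xy} - S'_x), \\
I_{aa} &= \lambda^2 a^{-2} I_{\lambda\lambda}, \\
I_{ay} &= \lambda a^{-1} I_{\lambda y}, \\
I_{yy} &= (x/y)^2 I'_{xx} + 2(x/y) I'_{xy} + I'_{yy}.
\end{aligned} \tag{12}$$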
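The change of parameters in (10) and (11) can be checked numerically. The sketch below assumes the chain probabilities obtained by averaging the Greenwood frequencies $q^2$, $2pq^2$, $2p^2q$ and $p^2$ over the $\beta$-distribution (7); the function names are illustrative, the counts are those of Table 5, and the trial parameter values are merely near the estimates quoted later.

```python
import math


def chain_probabilities(x, y):
    """Probabilities of chains (1), (1^2), (1^3), (12) under the beta model."""
    s = x + y
    p1 = y * (y + 1) / (s * (s + 1))
    p11 = 2 * x * y * (y + 1) / (s * (s + 1) * (s + 2))
    p111 = 2 * x * (x + 1) * y / (s * (s + 1) * (s + 2))
    p12 = x * (x + 1) / (s * (s + 1))
    return p1, p11, p111, p12


def log_likelihood(C, B, E, F, x, y):
    """Multinomial log-likelihood (up to a constant) for the four chain types."""
    probs = chain_probabilities(x, y)
    return sum(n * math.log(p) for n, p in zip((C, B, E, F), probs))


def auxiliary_scores(C, B, E, F, x, y):
    """Scores (8) with respect to x and y."""
    M = B + C + E + F
    common = M / (x + y) + M / (x + y + 1) + (B + E) / (x + y + 2)
    S_x = (B + E + F) / x + (E + F) / (x + 1) - common
    S_y = (B + C + E) / y + (B + C) / (y + 1) - common
    return S_x, S_y


def scores_lambda_a_y(C, B, E, F, lam, a, y):
    """Scores (11) in terms of lambda, a and y, using relation (10)."""
    x = y * math.expm1(lam * a)
    S_x, S_y = auxiliary_scores(C, B, E, F, x, y)
    return a * (x + y) * S_x, lam * (x + y) * S_x, (x / y) * S_x + S_y


print(scores_lambda_a_y(6, 11, 6, 37, 0.18, 7.05, 0.56))
```

Comparing these values with finite differences of `log_likelihood` taken along relation (10) gives a direct check on (11).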


Time-interval distribution for triple introductions


The last item to be discussed is the extraction of information from the G families involving a triple introduction. Each basic observation is the number-pair (u, v), where [...]. Let the times at which symptoms occur in the three patients, measuring from the common point of infection as origin, be $\xi_1$, $\xi_2$ and $\xi_3$. Then these variables are normally and independently distributed with mean m + a and variance $\sigma^2$. The joint-frequency distribution of the ordered trio $(\xi_1, \xi_2, \xi_3)$ is
$$f(\xi_1 \le \xi_2 \le \xi_3)\, d\xi_1\, d\xi_2\, d\xi_3 = \frac{3!}{(2\pi)^{3/2}\sigma^3} \exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{3}(\xi_i - m - a)^2\right\} d\xi_1\, d\xi_2\, d\xi_3. \tag{13}$$
If we now use the transformation
$$u = \xi_2 - \xi_1, \qquad v = \xi_3 - \xi_2, \qquad s = \xi_1 + \xi_2 + \xi_3, \tag{14}$$
and write down the joint distribution of u, v and s, we can then integrate out s to give the required joint distribution of u and v only. This turns out to be
$$f(u, v)\, du\, dv = \frac{\sqrt{3}}{\pi\sigma^2} \exp\left\{-\frac{1}{3\sigma^2}(u^2 + uv + v^2)\right\} du\, dv. \tag{15}$$
We can now derive a score and information function for $\sigma$ in the usual way. If for a set of observed number-pairs $(u_i, v_i)$ $(i = 1, \ldots, G)$, we put
$$[\ldots] \tag{16}$$
we obtain
$$[\ldots], \qquad I_{\sigma\sigma} = 4G\sigma^{-2}. \tag{17}$$
These results are, moreover, not affected by variations in the chance of infection.

This now completes the derivation of all the various components of the scores and information functions required for obtaining maximum-likelihood estimates of the five parameters $\lambda$, a, m, $\sigma$ and y. The only elements of the 5 x 5 information matrix which have not been mentioned explicitly are $I_{my}$ and $I_{\sigma y}$, and these are easily seen to be identically zero. In analysing actual data we first consider them according to the five main aspects described above. Next, we calculate in each case, for trial values of the parameters, the appropriate contributions to scores and information functions. Corresponding contributions are then added and the resultant vector of all scores multiplied by the inverse of the complete information matrix to give first corrections to the trial values. The process is then repeated as usual until the desired accuracy is attained.
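The scoring cycle described above can be illustrated on the simplest component, the triple introductions, where only $\sigma$ is involved. The sketch below is an assumption-laden illustration: the score is obtained here by differentiating the logarithm of (15), giving $2G\sigma^{-1}(\tfrac{1}{3}R\sigma^{-2} - 1)$ with $R$ the mean of $u^2 + uv + v^2$ (a symbol introduced only for this example), while the information $4G\sigma^{-2}$ is that of (17). The number-pairs are hypothetical.

```python
def triple_score_info(pairs, sigma):
    """Score (from the log of (15)) and information 4G/sigma^2 for sigma."""
    G = len(pairs)
    R = sum(u * u + u * v + v * v for u, v in pairs) / G   # illustrative symbol
    score = 2.0 * G / sigma * (R / (3.0 * sigma**2) - 1.0)
    information = 4.0 * G / sigma**2
    return score, information


def scoring_iterations(pairs, sigma0, tol=1e-12, max_iter=50):
    """Repeat: correction = score / information, until the correction is negligible."""
    sigma = sigma0
    for _ in range(max_iter):
        score, information = triple_score_info(pairs, sigma)
        correction = score / information
        sigma += correction
        if abs(correction) < tol:
            break
    return sigma


# Hypothetical (u, v) pairs in days, not the Cirencester observations.
pairs = [(0, 1), (1, 0), (2, 1), (1, 3), (0, 2), (1, 1), (3, 0)]
sigma_hat = scoring_iterations(pairs, sigma0=3.0)
print(sigma_hat)
```

With five parameters the same cycle applies, except that the correction is the solution of a 5 x 5 linear system formed from the summed scores and the summed information matrix.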


4. ILLUSTRATIVE EXAMPLE 


The foregoing maximum-likelihood scores and information functions can now be applied to Dr Hope Simpson's data, the relevant aspects of which are exhibited in Tables 1-5. A preliminary inspection of the data shows that so far as the z-distribution in Table 4 goes a really good fit is unlikely to be obtained, because of an apparent excess of observations at [...] days with a balancing deficit at [...] days. In Part I a difficulty of this sort was encountered with families of two in the appearance of possible spurious peaks at 7 and 14 days, thought to be due to an unconscious bias associated with integral multiples of a week. In the present case no very obvious explanation can be discerned. However, as this material seems in other respects to be quite sound, the best plan seems to be to pool the frequencies for 10 and 11 days when proceeding to test goodness-of-fit.


Preliminary estimates for $\lambda$, a, m and $\sigma$ were taken from the final values previously obtained for families of two, while a trial value of y was available from an earlier analysis of data from Providence, Rhode Island (Bailey, 1953). After carrying out the standard procedure of maximum-likelihood scoring, the final estimates turned out to be
$$\begin{aligned}
\hat{\lambda} &= 0.180 \pm 0.039, \\
\hat{a} &= 7.05 \pm 1.13 \text{ days}, \\
\hat{m} &= 7.63 \pm 0.50 \text{ day}, \\
\hat{\sigma} &= 1.59 \pm 0.26 \text{ day}, \\
\hat{y} &= 0.56 \pm 0.32.
\end{aligned} \tag{18}$$


These estimates of $\lambda$, a, m and $\sigma$ are rather less precise than those given in Part I for families of two, but it may be noted that in no case are the two estimates of any parameter significantly different. The parameter y is not determined with much precision, although the estimate obtained does suggest an appreciable amount of variation in the chance of infection from family to family.

In testing goodness-of-fit, based as in Part I on the last set of estimates but one in the iteration, the classes bracketed together have been pooled to avoid small expectations. We have also amalgamated the frequencies for 10 and 11 days in the z-distribution as indicated above. The total number of degrees of freedom is 13, i.e. 1 from the w-distribution [...]; with five parameters estimated this gives a $\chi^2$ of 14.6 on 8 D.F. As the 5 % point is at 15.5 we can regard the fit as reasonably satisfactory, except of course for the anomalous behaviour of the frequencies in the 10- and 11-day classes of the z-distribution. Whether this is due to some bias in collecting the data, or whether it is of genuine biological significance, is a matter which requires further investigation, and should be given special attention when new data of this type are collected.

Some consideration should be given at this point to the consequences of neglecting the effect of variations in $\lambda$ on the form of the z-distribution. Using the method of § 5 of Part I it can be shown that the fairly substantial variation envisaged there would in the present case actually improve the fit very slightly, reducing $\chi^2$ by about 0.5.
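Part of the goodness-of-fit calculation can be reproduced from the figures printed in Table 5. The sketch below computes only the contributions of the chain frequencies to $\chi^2$; it does not attempt the w- and z-distribution contributions, and the function name is illustrative.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the classes supplied."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed and expected numbers of chains, from Table 5.
single = chi_square([6, 11, 6, 37], [8.62, 6.23, 9.81, 35.34])
double = chi_square([4, 7], [3.05, 7.95])
print(round(single, 2), round(double, 2))
```

These two components account for a little under half of the total $\chi^2$ quoted above, the largest single item being the excess of chains of type (1²).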


n 


5. EXTENSION TO LARGER FAMILIES 


The general procedure described here for families of three can clearly be extended to larger families. By picking out contributions to w- and z-type distributions we could use the scores and information functions given in (1)-(4) directly. When analysing the different kinds of chain allowing for a variable chance of infection the formulae given above in (10)-(12) would have to be applied to the extensions of (8) and (9) indicated in detail in Bailey (1953). Double and triple introductions can also be dealt with as above. Other multiple introductions require the obvious extensions of (13)-(17). We have in the present paper neglected to make use of the distribution of v-intervals in chains of type (21), because of undue complexity. With larger families there would be further relatively intractable items of this sort. Another point of importance is that with small families the errors introduced by neglecting to make allowance for the probability of chains being misclassified are likely to be small. But with larger families this source of error would be much more pronounced because of the greater opportunity for the distributions of different kinds of chains to overlap. If sufficient data of this kind are forthcoming the difficulty could be tackled along the lines suggested in § 6 of Part I.


It is a pleasure to acknowledge my indebtedness to Dr R. E. Hope Simpson, of the [...] Public Health Laboratory Service, for making available to me the epidemiological records used for the illustrative example in § 4, and to express my thanks to [...] Tamara Hazlewood for undertaking the computations required in obtaining the numerical results.

REFERENCES 


BAILEY, N. T. J. (1953). The use of chain-binomials with a variable chance of infection for the analysis of intra-household epidemics. Biometrika, 40, 279.
BAILEY, N. T. J. (1955). Some problems in the statistical analysis of epidemic data. J. R. Statist. Soc. B, 17, 35.
BAILEY, N. T. J. (1956). On estimating the latent and infectious periods of measles. I. Families with two susceptibles only. Biometrika, 43, 15.




SIGNIFICANCE TESTS FOR A VARIABLE CHANCE OF INFECTION 
IN CHAIN-BINOMIAL THEORY 


By NORMAN T. J. BAILEY 
Design and Analysis of Scientific Experiment, 6 Keble Road, Oxford 


In their analysis of measles data for Providence, Rhode Island, Wilson, Bennett, Allen & Worcester (1939) found that, although Greenwood's chain-binomial model (Greenwood, 1931) fitted satisfactorily the distribution of the total number of cases in families of given size, this theory was inadequate when the data were analysed according to the different types of chains involved. Greenwood (1949) suggested that this might be due to variations in the chance of infection, p, between different households. Subsequently it was shown by Bailey (1953) that good agreement between theory and observation could be obtained on the assumption that p varied according to the $\beta$-distribution
$$dF = \frac{1}{B(x, y)}\, p^{x-1} q^{y-1}\, dp \qquad (0 < p < 1), \tag{1}$$
where $q = 1 - p$. The appropriate scores and information functions were given for estimating the two parameters, x and y, in households of three and four susceptibles.
Now if in fact p is constant, x and y are both infinite and the maximum-likelihood method breaks down. Dr Armitage pointed this out in the discussion on my Royal Statistical Society paper (Bailey, 1955), and suggested that one might find it preferable to work with the reciprocals $x' = x^{-1}$ and $y' = y^{-1}$. This is certainly more satisfactory, but the maximum-likelihood scoring technique again fails in the limit as $x'$ and $y'$ tend to zero, since the scores contain terms proportional to $x'^{-1}$ and $y'^{-1}$. We cannot derive an adequate significance test for variation in the chance of infection without making some assumption about the limiting form of the ratios $y/x$ and $y'/x'$. It follows from (1) that the average chance of infection, $\bar{p}$, is given by
$$\bar{p} = \frac{x}{x + y} = \left(1 + \frac{y}{x}\right)^{-1}. \tag{2}$$


One way out of the difficulty is therefore to use $\bar{p}$ as one parameter, and either $x'$ or $y'$ as the other. The only drawback here is the algebraic complexity of the scores and information functions. An alternative and much simpler formulation is obtained by writing
$$x = p/z, \qquad y = q/z, \tag{3}$$
where, for convenience, we have dropped the bar from $\bar{p}$, and $q = 1 - p$.

Table 1 gives the expected frequencies of different types of chains in families of three for the Greenwood model, the modified model involving x and y, and the alternative form of the latter using p and z. Observations from the Providence measles data are also included.

Scores for p and z are
$$\begin{aligned}
S_p &= \frac{b+c+d}{p} - \frac{a+b+c}{q} + \frac{c+d}{p+z} - \frac{a+b}{q+z}, \\
S_z &= -\frac{n}{1+z} - \frac{2(b+c)}{1+2z} + \frac{c+d}{p+z} + \frac{a+b}{q+z},
\end{aligned} \tag{4}$$

while the observed information functions are
$$\begin{aligned}
I_{pp} &= \frac{b+c+d}{p^2} + \frac{a+b+c}{q^2} + \frac{c+d}{(p+z)^2} + \frac{a+b}{(q+z)^2}, \\
I_{pz} &= \frac{c+d}{(p+z)^2} - \frac{a+b}{(q+z)^2}, \\
I_{zz} &= -\frac{n}{(1+z)^2} - \frac{4(b+c)}{(1+2z)^2} + \frac{c+d}{(p+z)^2} + \frac{a+b}{(q+z)^2}.
\end{aligned} \tag{5}$$
The expressions appearing in (4) and (5) are particularly suited to rapid computation, since for trial values of p and z the various quotients involved in each line can be accumulated rapidly on a calculating machine, the squares of denominators in the case of information functions being read directly from Barlow's tables.


Table 1. Expected and observed numbers of chains in families of three susceptibles

Type of chain | Expected nos. on Greenwood model | Expected nos. on modified model, in terms of x and y | Expected nos. on modified model, in terms of p and z | Observed nos. | Providence measles data
1     | $nq^2$    | $\dfrac{ny(y+1)}{(x+y)(x+y+1)}$            | $\dfrac{nq(q+z)}{1+z}$            | a | 34
1²    | $2npq^2$  | $\dfrac{2nxy(y+1)}{(x+y)(x+y+1)(x+y+2)}$   | $\dfrac{2npq(q+z)}{(1+z)(1+2z)}$  | b | 25
1³    | $2np^2q$  | $\dfrac{2nx(x+1)y}{(x+y)(x+y+1)(x+y+2)}$   | $\dfrac{2npq(p+z)}{(1+z)(1+2z)}$  | c | 36
12    | $np^2$    | $\dfrac{nx(x+1)}{(x+y)(x+y+1)}$            | $\dfrac{np(p+z)}{1+z}$            | d | 239
Total | n         | n                                          | n                                 | n | 334
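The three columns of expected frequencies in Table 1 can be checked against one another numerically. In the sketch below the function names are illustrative, and the trial values of p and z are arbitrary; the check confirms that substitution (3) carries the (x, y) form into the (p, z) form, and that the latter reduces to the Greenwood model at z = 0.

```python
def probs_greenwood(p):
    """Chain probabilities (1), (1^2), (1^3), (12) with constant p."""
    q = 1.0 - p
    return q * q, 2 * p * q * q, 2 * p * p * q, p * p


def probs_xy(x, y):
    """Chain probabilities when p follows the beta-distribution (1)."""
    s = x + y
    return (y * (y + 1) / (s * (s + 1)),
            2 * x * y * (y + 1) / (s * (s + 1) * (s + 2)),
            2 * x * (x + 1) * y / (s * (s + 1) * (s + 2)),
            x * (x + 1) / (s * (s + 1)))


def probs_pz(p, z):
    """The same probabilities after the substitution x = p/z, y = q/z."""
    q = 1.0 - p
    return (q * (q + z) / (1 + z),
            2 * p * q * (q + z) / ((1 + z) * (1 + 2 * z)),
            2 * p * q * (p + z) / ((1 + z) * (1 + 2 * z)),
            p * (p + z) / (1 + z))


p_trial, z_trial = 0.789, 0.3            # arbitrary trial values
print(probs_pz(p_trial, z_trial))
print(probs_xy(p_trial / z_trial, (1 - p_trial) / z_trial))
```

The (p, z) form has the practical advantage that a constant chance of infection corresponds to the finite value z = 0 rather than to infinite x and y.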


Suppose, however, we are merely concerned to test for the existence of variability in the chance of infection, i.e. to examine the hypothesis that z = 0. Then we can avoid the full scoring procedure involved in estimating p and z jointly by basing the significance test on the distribution of the score $S_z$ calculated at the values z = 0 and $p = \hat{p}_0$, where $\hat{p}_0$ is the maximum-likelihood estimate of p when z = 0. Thus
$$\hat{p}_0 = \frac{b + 2c + 2d}{2a + 3b + 3c + 2d}, \tag{6}$$
and
$$S_z(\hat{p}_0, 0) = \frac{c+d}{\hat{p}_0} + \frac{a+b}{\hat{q}_0} - n - 2(b+c), \tag{7}$$
where $\hat{q}_0 = 1 - \hat{p}_0$. The large-sample variance of $S_z(\hat{p}_0, 0)$ is easily shown to be $[I_{zz} - I_{pz}^2/I_{pp}]_{z=0}$. Using expected information functions we therefore have
$$\operatorname{var} S_z(\hat{p}_0, 0) = \frac{n(3 - 7\hat{p}_0\hat{q}_0)}{1 + \hat{p}_0\hat{q}_0}. \tag{8}$$
The data shown in Table 1 give $\hat{p}_0 = 0.789$ and $S_z = 172 \pm 23$. This result is strongly significant, as expected from the earlier investigation (Bailey, 1953).
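The short test of (6)-(8) is easily programmed. In the sketch below the function name is illustrative; the counts a, b and d are those legible in Table 1, and c = 36 is taken as the remainder needed to reach the printed total of 334, which should be regarded as an assumption.

```python
import math


def score_test_three(a, b, c, d):
    """Return p0-hat (6), the score S_z at z = 0 (7) and its standard error from (8)."""
    n = a + b + c + d
    p0 = (b + 2 * c + 2 * d) / (2 * a + 3 * b + 3 * c + 2 * d)
    q0 = 1.0 - p0
    S_z = (c + d) / p0 + (a + b) / q0 - n - 2 * (b + c)
    variance = n * (3 - 7 * p0 * q0) / (1 + p0 * q0)
    return p0, S_z, math.sqrt(variance)


# Providence measles data for families of three (c inferred from the total).
p0_hat, S_z0, se = score_test_three(34, 25, 36, 239)
print(round(p0_hat, 3), round(S_z0), round(se))
```

The ratio of the score to its standard error, about 7.5, may be referred to the normal distribution, in agreement with the conclusion that the result is strongly significant.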

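Similar results are easily obtained for larger families. Table 2 gives the expectations and observations appropriate to families of four. The frequencies expected in terms of x and y have been omitted for reasons of space, but have been set out before in Table 4 of my 1953 paper. This time the scores for p and z are
$$\begin{aligned}
S_p ={}& \frac{n-a}{p} - \frac{n-h}{q} + \frac{n-a-b}{p+z} + \frac{e+f+g+h}{p+2z} - \frac{n-g-h}{q+z} - \frac{a+b+c+e}{q+2z} - \frac{b+c}{q+3z}, \\
S_z ={}& -\frac{n}{1+z} - \frac{2n}{1+2z} - \frac{3(n-a-h)}{1+3z} - \frac{4(b+c+e+f)}{1+4z} - \frac{5(c+e)}{1+5z} \\
& + \frac{n-a-b}{p+z} + \frac{2(e+f+g+h)}{p+2z} + \frac{n-g-h}{q+z} + \frac{2(a+b+c+e)}{q+2z} + \frac{3(b+c)}{q+3z},
\end{aligned} \tag{9}$$
with information functions
$$\begin{aligned}
I_{pp} ={}& \frac{n-a}{p^2} + \frac{n-h}{q^2} + \frac{n-a-b}{(p+z)^2} + \frac{e+f+g+h}{(p+2z)^2} + \frac{n-g-h}{(q+z)^2} + \frac{a+b+c+e}{(q+2z)^2} + \frac{b+c}{(q+3z)^2}, \\
I_{pz} ={}& \frac{n-a-b}{(p+z)^2} + \frac{2(e+f+g+h)}{(p+2z)^2} - \frac{n-g-h}{(q+z)^2} - \frac{2(a+b+c+e)}{(q+2z)^2} - \frac{3(b+c)}{(q+3z)^2}, \\
I_{zz} ={}& -\frac{n}{(1+z)^2} - \frac{4n}{(1+2z)^2} - \frac{9(n-a-h)}{(1+3z)^2} - \frac{16(b+c+e+f)}{(1+4z)^2} - \frac{25(c+e)}{(1+5z)^2} \\
& + \frac{n-a-b}{(p+z)^2} + \frac{4(e+f+g+h)}{(p+2z)^2} + \frac{n-g-h}{(q+z)^2} + \frac{4(a+b+c+e)}{(q+2z)^2} + \frac{9(b+c)}{(q+3z)^2}.
\end{aligned} \tag{10}$$
Again we see that the scores and information functions are in a form suitable for rapid computation.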


The short significance test for variation in infectiousness is now given by calculating
$$S_z(\hat{p}_0, 0) = \frac{(c+d) + 3(e+f+g+h)}{\hat{p}_0} + \frac{3(a+e) + 6(b+c) + (d+f)}{1 - \hat{p}_0} - (3a + 10b + 15c + 6d + 15e + 10f + 6g + 3h), \tag{11}$$
where
$$\hat{p}_0 = \frac{b + 2(c+d) + 3(e+f+g+h)}{3a + 5b + 6c + 4d + 6e + 5f + 4g + 3h}. \tag{12}$$
This time the expression for the variance of the score does not appear to reduce to a convenient simple formula, but we can still avoid the full iterative procedure by using, as before,
$$\operatorname{var} S_z(\hat{p}_0, 0) = [I_{zz} - I_{pz}^2/I_{pp}]_{z=0}, \tag{13}$$
but where, for z = 0, we now substitute the observed information functions given by
$$\begin{aligned}
I_{pp} &= \frac{b + 2(c+d) + 3(e+f+g+h)}{\hat{p}_0^2} + \frac{3(a+e) + 4(b+c) + 2(d+f) + g}{(1 - \hat{p}_0)^2}, \\
I_{pz} &= \frac{(c+d) + 3(e+f+g+h)}{\hat{p}_0^2} - \frac{3(a+e) + 6(b+c) + (d+f)}{(1 - \hat{p}_0)^2}, \\
I_{zz} &= \frac{(c+d) + 5(e+f+g+h)}{\hat{p}_0^2} + \frac{5(a+e) + 14(b+c) + (d+f)}{(1 - \hat{p}_0)^2} - \{5(a+h) + 30(b+f) + 55(c+e) + 14(d+g)\}.
\end{aligned} \tag{14}$$


Norman T. J. BAILEY 335 


If the Reed-Frost or Lidwell variant of the chain-binomial model (see Bailey, 1953, 1955) were thought necessary, certain obvious modifications of the above would be required.

Application of these formulae to the data in Table 2 gives $\hat{p}_0 = 0.791$ and $S_z = 157 \pm 41$, a highly significant result.

The methods discussed above can also be applied to a recent development of chain-binomial theory (Bailey, 1956) in which account is taken of variations in the incubation period, and infectiousness is not confined to an instant but is allowed to persist for a fixed


Table 2. Expected and observed numbers in families of four susceptibles 


ue Provid 
iss Observed ence 
A nos. on D d . on modified model Eo ies 
chain Greenwood Expected nos. on nah eed 
model 
| 
1 nq(g 4-2) (q-- 22) ? | s 
Ro (42) 14-22 | | 
3npq(q--2) (1-22 (g-32) : A 
E m ate) 029 (32 0+4) 
z 22) (q +32) | 
6npa(p +2) (a+72) G+ . A | : 
ia eae array (+ Be) (1-89 (42 +5) s 
3npqCp +2) (q+) 5 
_ nppl p ta 7 
ig Sgir Tre) (1+ 2) (1+ 32) 8 
z 2z) 
6npa(p +2) (p4-22) (g+2) (+ i : 
3 bi Tee) (heme) (o8) (1-4 (1-52) 
3npq(p +2) (P+ 22) (q+2) ; ' 
3e 3np*q* 242) 0-22 (14-32) (19-49) 
3npq(p +2) (p+2) Y " 
E Snp'q (2) (1422) 0-032) 
np(p+2) (329) à 4 
ae np (142) 0-22 
? n 100 
| ai i n 


four basic parameters. In families of two the data are 

S aH z n 

Insensit; <a infectiousness petween families, but with families of three, 
Deve e to variations in ppm nalysing that part of the data which yields chains of 
i bs. bod must Be aioli One ad tional parameter was therefore introduced—the 

S sho E able 1. * s - WEM ui j s 

Quantity call Me z o. Difficulties again arise if we wish i. CORE y =y- is signi- 
cantly diff S E Bn zero The solution is to use 2 for the H vw E | c ee as above. 
e tond alcoiltarg scores and information fn it B5 Tpm 15, and Iz, given 

Y (4) and (5) above with primes added, corresponding d he of the 1956 paper. The 

u * 3 
Slàtio gei eR a uo PS © scores and information 


derive the co 
functions for the parameters A, a and z. W 


i 
"erva] of time. This model use 


e thus have 


S = a8), &oME = 8 (15) 


and Qa = a?q(gI;,, +S) Ji = Aah, —48, 


Rs (16) 
D- = aqli,.. I3 a hy 
dig = Aa“, I. = des 


corresponding to (11) and (12) of the 1956 paper. We could find maximum-likelihood estimates of the four parameters A, a, m and g, given $z = 0$, and then test the significance of $z$ by comparing $S_z$ with its standard error. However, it is probably easier in practice first to test $z$ on the chained part of the data only, using expressions (6), (7) and (8) above. We then proceed to estimate all five parameters, using as a trial value of $z$ either zero, if the preliminary test is not significant, or the actual estimate from this part of the data, based on (4) and (5), if it is.


REFERENCES

BAILEY, N. T. J. (1953). The use of chain-binomials with a variable chance of infection for the analysis of intra-household epidemics. Biometrika, 40, 279.
BAILEY, N. T. J. (1955). Some problems in the statistical analysis of epidemic data. J. R. Statist. Soc. B, 17, 35.
BAILEY, N. T. J. (1956). On estimating the latent and infectious periods of measles. II. Families with three or more susceptibles. Biometrika, 43, 322.
GREENWOOD, M. (1931). On the statistical measure of infectiousness. J. Hyg., Camb., 31, 336.
GREENWOOD, M. (1949). The infectiousness of measles. Biometrika, 36, 1.
WILSON, E. B., BENNETT, C., ALLEN, M. & WORCESTER, J. (1939). Measles and scarlet fever in Providence, R.I., 1929-34 with respect to age and size of family. Proc. Amer. Phil. Soc.


| 




ON THE VARIATION OF YIELD VARIANCE WITH PLOT SIZE 


By P. WHITTLE 


Applied Mathematics Laboratory, New Zealand Department of Scientific
and Industrial Research, Wellington


The problem which has been examined is that of evaluating the spatial covariance function of yield density, from a knowledge of the way yield variance varies with plot size and shape. Results are obtained in §3 for several kinds of plot. Results are also obtained (§4) on the dependence of the yield variance on plot geometry for very small and very large plots. Special attention is paid to the case for which the covariance follows a power law at large distances.


1. INTRODUCTION

It is well known that, in order to explain the observed variation of yield variance with size and shape of plot, it is necessary to allow the possibility of correlation between yield densities at any two points in the plot. (We shall restrict ourselves to the stationary case, for which the expected yield density is constant over the area.) Moreover, it appears that this spatial correlation must often fall off relatively slowly with increasing distance between the two points; as a power function of the distance rather than as an exponential.

The same behaviour is shown by observations on yarn diameter, flood height (Feller, 1951) and response from population samples. The calculations of this article will apply to these cases, too, but for concreteness we shall continue to speak of plots and yields (although we sometimes use the word 'region' instead of 'plot', indicating that we do not confine ourselves to two dimensions).

The type of calculation most usually made is to evaluate the yield variance for a plot of definite size and shape, and for a given spatial covariance function $\rho(s)$. However, the inverse calculation would probably be more useful in general: to determine the covariance function from a knowledge of yield variance as a function of plot geometry. Such a procedure would enable us to make use of experimental results to obtain at least a partial estimate of $\rho(s)$. We consider this question in §3.

The solution of the problem is most easily reached by using a Mellin transform, so that it is convenient if $\rho$ falls off as a power of the distance,

$$\rho(s) \sim C|s|^{-\lambda} \quad (s \text{ large}). \tag{1}$$

There is good evidence to show that many observed covariance functions are of the form (1) (Fairfield Smith, 1938). This is all the more interesting, in view of the fact that the simpler linear models of yield variation do not predict a power law, but rather a much more quickly diminishing law of the type

$$\rho(s) \sim E|s|^{-\alpha}e^{-\kappa|s|} \quad (s \text{ large}). \tag{2}$$

We pursue this question further in §4.

It is easily seen that the yield variance of a large one-dimensional interval is proportional to a power of the length of the interval if the covariance function is of the type (1), but analogous results are not obtained quite so immediately in two or more dimensions. This topic is also covered in §§3 and 4.




2. FORMULAE FOR THE YIELD VARIANCE

We shall use a vector co-ordinate $x$ to denote points in the plane. Consider a plot, or region, $\Omega$, with area $A$ and yield $y(\Omega)$. (We use the words 'plane' and 'area', although the formulae of the paper apply to regions in $n$-dimensional space, $n = 1, 2, 3, \ldots$.)

Consider two regions $\Omega_1$ and $\Omega_2$ which contract to zero in such a way that ultimately they constitute two points a distance $s$ apart. We shall define the covariance function

$$\rho(s) = \lim \frac{\operatorname{cov}[y(\Omega_1), y(\Omega_2)]}{A_1 A_2} \tag{3}$$

and shall restrict ourselves to cases for which this limit exists.

If $V(\Omega)$ is the population variance of $y(\Omega)$, then it follows from definition (3) that

$$V(\Omega) = \int_\Omega\int_\Omega \rho(x_1 - x_2)\,dx_1\,dx_2. \tag{4}$$

If $\rho(s)$ is continuous at the origin then we can draw one immediate conclusion from (4):

$$V(\Omega) \sim \rho(0)A^2 \quad (\Omega \text{ small}), \tag{5}$$

that is, for small regions (i.e. for regions which are small in all their dimensions), $V$ is proportional to the square of the area. This is to be contrasted with the case of zero spatial correlation, when $V$ is proportional to area under all circumstances.

For the one-dimensional case, when $\Omega$ consists of a segment of a line of length $a$, (4) simplifies to

$$V(a) = 2\int_0^a (a - x)\rho(x)\,dx. \tag{6}$$

For the case when $\Omega$ consists of a rectangle in a plane, with sides $a$, $b$ parallel to the co-ordinate axes, we have similarly

$$V(a, b) = 4\int_0^a\int_0^b (a - x)(b - y)\rho(x, y)\,dx\,dy. \tag{7}$$
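Formulae (4) to (6) can be checked against one another numerically. The sketch below assumes an exponential covariance $\rho(s) = e^{-|s|}$, chosen purely for illustration and not taken from the paper, for which (6) integrates to $2(a - 1 + e^{-a})$; the function names are hypothetical.

```python
import math

def rho(s):
    # Assumed covariance function, for illustration only.
    return math.exp(-abs(s))

def variance_single(a, n=2000):
    """Formula (6): 2 * int_0^a (a - x) rho(x) dx, by the midpoint rule."""
    h = a / n
    return 2.0 * h * sum((a - (i + 0.5) * h) * rho((i + 0.5) * h) for i in range(n))

def variance_double(a, n=400):
    """Formula (4) on a segment: double integral of rho(x1 - x2), midpoint rule."""
    h = a / n
    pts = [(i + 0.5) * h for i in range(n)]
    return h * h * sum(rho(x1 - x2) for x1 in pts for x2 in pts)

def variance_exact(a):
    """Closed form of (6) for rho(s) = exp(-|s|)."""
    return 2.0 * (a + math.expm1(-a))
```

For a short segment the ratio `variance_exact(a) / a**2` approaches $\rho(0) = 1$, which is the small-region behaviour stated in (5).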


The integral (6) may be evaluated for the commoner choices of $\rho(s)$, but (7) and other multi-dimensional formulae can generally be evaluated only for rather unlikely forms of the function $\rho(s)$.

If we introduce the spectral density function

$$F(\omega) = \int e^{i\omega\cdot x}\rho(x)\,dx \tag{8}$$

and the areal characteristic function

$$G(\omega) = \int_\Omega e^{i\omega\cdot x}\,dx, \tag{9}$$

then equation (4) can be rewritten as

$$V(\Omega) = \frac{1}{(2\pi)^n}\int |G(\omega)|^2 F(\omega)\,d\omega. \tag{10}$$

This formula sometimes has advantages over (4), since the limits of integration are fixed, the weights given to different wave numbers $\omega$ are quite evident, and approximate methods such as the method of steepest descents can be used immediately. The function $G(\omega)$ is easily evaluated for simple plot shapes. Thus for the rectangle with sides $a$, $b$ (centred at the origin)

$$G(\omega) = \frac{4\sin(\tfrac12 a\omega_1)\sin(\tfrac12 b\omega_2)}{\omega_1\omega_2}, \tag{11}$$

and for a circle with radius $a$ and centre at the origin

$$G(\omega) = \frac{2\pi a}{\omega}J_1(a\omega), \tag{12}$$

where $J_1$ is the Bessel function of order 1. Here $\omega_1$ and $\omega_2$ are the components of $\omega$, and $\omega$ its absolute value.


3. THE INVERSE PROBLEM

From the practical point of view it would be useful to be able to invert relation (4), and express the covariance function $\rho(s)$ in terms of the yield variance $V(\Omega)$, which can presumably be observed for varying $\Omega$.

This inversion is very simple for the special cases (6) and (7), corresponding to linear and rectangular plots. We have respectively

$$\rho(x) = \frac12\frac{\partial^2 V(x)}{\partial x^2}, \tag{13}$$

$$\rho(x, y) = \frac14\frac{\partial^4 V(x, y)}{\partial x^2\,\partial y^2}. \tag{14}$$

However, there are few cases as tractable as these.

We shall now largely confine ourselves to one general type of case. We shall assume that the covariance is isotropic, so that $\rho(\mathbf{s})$ is a function of $s = |\mathbf{s}|$ only, and we may write

$$\rho(\mathbf{s}) = \rho(s). \tag{15}$$

Further, we shall assume that we are dealing with a plot of constant shape, which can only vary similarly, i.e. by changing all its dimensions in the same ratio. For a given plot shape, the size of the plot is conveniently specified by its largest dimension, $x$ (say), and we shall write

$$V(\Omega) = V(x). \tag{16}$$

(Thus, for a circular plot $x$ equals the diameter, for a rectangular plot $x$ equals the diagonal, etc.) Now, corresponding to each plot shape there will be a function $K(s)$ describing the distribution of distances between two points chosen randomly in the plot, such that

$$V(1) = \int_0^1 K(s)\rho(s)\,ds. \tag{17}$$

If we now increase all linear dimensions in the ratio $x : 1$, then the infinitesimal elements of integration will be increased in the ratio $x^{2n} : 1$, and we shall have quite generally

$$V(x) = x^{2n}\int_0^1 K(s)\rho(xs)\,ds. \tag{18}$$




The form of the relation (18) between $V$ and $\rho$ invites us to introduce the Mellin transforms

$$\bar V(\nu) = \int_0^\infty V(s)s^{\nu-1}\,ds, \tag{19}$$

$$\bar\rho(\nu) = \int_0^\infty \rho(s)s^{\nu-1}\,ds, \tag{20}$$

$$\bar K(\nu) = \int_0^\infty K(s)s^{\nu-1}\,ds, \tag{21}$$

in terms of which (18) becomes

$$\bar V(\nu - 2n) = \bar K(1 - \nu)\,\bar\rho(\nu). \tag{22}$$

We shall consider in §4 the common range of $\nu$ for which the transforms (19) and (20) exist, and (22) is valid. Assuming for the moment that a suitable range exists, we can immediately obtain a solution for $\rho(s)$ by taking the inverse transform of the expression for $\bar\rho$ provided by (22). In fact, if the desired inverse relation to (18) is

$$\rho(x) = x^{-2n}\int_0^\infty L(s)V(xs)\,ds, \tag{23}$$

then

$$\bar\rho(\nu) = \bar L(1 + 2n - \nu)\,\bar V(\nu - 2n), \tag{24}$$

where $\bar L(\nu)$ has a similar definition to $\bar K(\nu)$. Comparing (22) and (24) we see that

$$\bar L(\nu) = \frac{1}{\bar K(\nu - 2n)}. \tag{25}$$
In order to be able to carry out the calculation explicitly we must be able to perform the integration (21) and then invert the $\bar L(\nu)$ yielded by (25). One of the cases for which this is possible is the rather unrealistic one of a circular plot ($x$ = diameter, $n = 2$).
For the circle we obtain by direct methods (26) 
K(s) = 3[s cos (s) — 8? J(1— s)], gn 
zo TORD 
whence K(v) = (y) T+ 5)" 


(8) 
Eh) = 4(v—3) P(3v + 1)} 
il 1) T'{3(v—2)} i po take” 
T ; i 0 
However, upon attempting to invert $\bar L(\nu)$, we obtain a divergent integral. This may be taken as an indication of the fact that $\rho(x)$ should be expressed not only in terms of $V(x)$, but also of its derivatives $V^{(j)}(x)$. In view of results (13) and (14) this is not surprising. If we substitute the more general relation

$$\rho(x) = x^{-2n}\int_0^\infty \sum_{j=0}^{\infty} L_j(s)\left[(xs)^j V^{(j)}(xs)\right]ds, \tag{29}$$

we find that, provided that

$$\left[x^{\nu+j}V^{(j)}(x)\right]_0^\infty = 0 \quad (j = 0, 1, 2, \ldots), \tag{30}$$

then

$$\bar L(\nu) = \sum_j (\nu - 1)(\nu - 2)\ldots(\nu - j)\,\bar L_j(\nu). \tag{31}$$


Now, the L(v) of (28) can be written 


mah uss 


t) 
2-9 )tw- 1)—(v—1) (v—2) + (v— 1) (v — 2) (v—3)], 




which, in view of (29) and the integral representation of the Beta-function, corresponds to the relation

ple) = C ("tas t) es Va) + s Y" Ta 
7 Jo A4(1— s) (38) 
A case of greater practical interest would be that of the rectangle. However, while $K(s)$ is easily calculated for this figure, the transform $\bar K(\nu)$ is not.
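The distance kernel $K(s)$ of (17) can be checked numerically for the circular plot. The sketch below builds $K(s)$ for a disc of unit diameter as ring length times the overlap area of two such discs; this is our own derivation under the normalisation of (17) (so that $K$ integrates to the squared area), not a transcription of (26), and a different normalisation would rescale it by a constant. All names are hypothetical.

```python
import math

def lens_area(s, r=0.5):
    """Area common to two discs of radius r whose centres are a distance s apart."""
    if s >= 2.0 * r:
        return 0.0
    return 2.0 * r * r * math.acos(s / (2.0 * r)) - 0.5 * s * math.sqrt(4.0 * r * r - s * s)

def K(s):
    # Kernel for a disc of unit diameter: circumference of the ring times overlap area.
    return 2.0 * math.pi * s * lens_area(s)

def K_closed(s):
    # The same kernel written out for r = 1/2.
    return math.pi * (s * math.acos(s) - s * s * math.sqrt(1.0 - s * s))

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))
```

Taking $\rho \equiv 1$ in (17) shows why the integral of $K$ over $(0, 1)$ should equal the square of the plot area, here $(\pi/4)^2$.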


4. POWER-LAW COVARIANCES

We first return to relation (22). If a function $f(s)$ is $O(s^{-\alpha})$ and $O(s^{-\beta})$ for small and large positive $s$ respectively, then its Mellin transform $\bar f(\nu)$ will certainly exist for

$$\alpha < \operatorname{Re}(\nu) < \beta \tag{34}$$

and $\bar f(\nu)$ will in fact have simple poles at $\alpha$ and $\beta$.

Now $\rho(s)$ is $O(1)$ at the origin (if we suppose it continuous there), and if we assume it is $O(s^{-\lambda})$ for large $s$, then $\bar\rho(\nu)$ exists for $0 < \operatorname{Re}(\nu) < \lambda$. For small $s$, $K(s)$ is proportional to the area of an $n$-dimensional sphere of radius $s$, and is hence $O(s^{n-1})$, while for $s > 1$ it is zero. Consequently $\bar K(\nu)$ exists for $(1 - n) < \operatorname{Re}(\nu) < \infty$.

It thus follows that both quantities in the right-hand member of (22) exist for

$$0 < \operatorname{Re}(\nu) < \min(n, \lambda). \tag{35}$$


Since relation (22) determines $\bar V$ in terms of $\bar K$ and $\bar\rho$, the relation is also valid for these values of $\nu$, and $\bar V(\nu)$ exists for

$$-2n < \operatorname{Re}(\nu) < \min(-n, \lambda - 2n), \tag{36}$$

and has simple poles at these extreme values of $\nu$.

We can thus conclude that for small plots

$$V(x) = O(x^{2n}) = O(A^2), \tag{37}$$

while for large plots

$$V(x) = O(x^{n}) = O(A), \tag{38}$$

if $\rho(s)$ falls off more rapidly than $s^{-n}$, or

$$V(x) = O(x^{2n-\lambda}) = O(A^{2-\lambda/n}), \tag{39}$$

if $\rho(s)$ falls off as $s^{-\lambda}$ ($\lambda < n$). In other words, it is a power law [$\rho(s) \sim Cs^{-n}$] which marks the transition between the two cases where $V$ is proportional to plot area for large plots, or increases faster than plot area.
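The two large-plot regimes (38) and (39) can be seen numerically in one dimension ($n = 1$, so that $A = x$). The sketch below uses formula (6) with two assumed covariances chosen only for illustration: $\rho(s) = (1 + s)^{-\lambda}$ with $\lambda = \tfrac12 < n$, and $\rho(s) = e^{-s}$. The names are hypothetical.

```python
import math

def V_power(a, lam):
    """V(a) = 2 int_0^a (a - x)(1 + x)^(-lam) dx in closed form, for 0 < lam < 1."""
    u = 1.0 + a
    first = u * (u ** (1.0 - lam) - 1.0) / (1.0 - lam)
    second = (u ** (2.0 - lam) - 1.0) / (2.0 - lam)
    return 2.0 * (first - second)

def V_exponential(a):
    """V(a) for rho(s) = exp(-s), again from (6)."""
    return 2.0 * (a + math.expm1(-a))

def log_slope(V, a1, a2):
    """Average exponent of V between a1 and a2 on a log-log scale."""
    return math.log(V(a2) / V(a1)) / math.log(a2 / a1)

slope_power = log_slope(lambda a: V_power(a, 0.5), 1e4, 1e6)
slope_exponential = log_slope(V_exponential, 1e3, 1e4)
```

The first slope is close to $2 - \lambda/n = 1.5$, as in (39), and the second is close to 1, as in (38).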
Relation (22) may be used in reverse: if $V(x)$ is observed to increase as $A^{\mu}$ for large areas ($1 < \mu < 2$), then we can deduce that $\rho(s)$ falls off as $s^{n(\mu-2)}$ for large distances.

It should be emphasized that these conditions hold only for plots of constant shape, since $V$ will in general be a function of shape as well as area. However, some results presented by H. Fairfield Smith (1938) would seem to indicate that this dependence on shape may be weak, and that $V$ may be determined almost entirely by $A$; presumably only if the shape is not too extreme, e.g. not too narrow and elongated.

Fairfield Smith's results are also of interest in that they provide very convincing evidence of the behaviour associated with equation (39). He found that to a very good approximation




the variance of yield per unit area in plots of area $A$ could be represented as a curve const. $A^{-0.749}$. Replacing the index by $-\tfrac34$, we then have

$$V(A) \sim \text{const. } A^{2-\frac34} = \text{const. } A^{\frac54}, \tag{40}$$

corresponding to a covariance function

$$\rho(s) \sim \text{const. } s^{-\frac32}. \tag{41}$$

(These equalities cannot, of course, hold for indefinitely small $A$ and $s$.)
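The step from the fitted index to (40) and (41) is a short piece of arithmetic. If the variance of yield per unit area behaves as $A^{-b}$, the total variance behaves as $A^{2-b}$, and matching this with (39) gives $\lambda = nb$. The sketch below encodes that matching for plots in the plane; the function names are illustrative.

```python
def total_variance_exponent(b):
    """Exponent of A in V(A) when the variance per unit area varies as A^(-b)."""
    return 2.0 - b

def covariance_index(b, n=2):
    """Index lam in rho(s) ~ s^(-lam) implied by (39): 2 - lam/n = 2 - b."""
    return n * b
```

With $b = \tfrac34$ and $n = 2$ these return $\tfrac54$ and $\tfrac32$, the exponents appearing in (40) and (41).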

This result, well founded in observation, provides evidence of three intriguing possibilities: (a) that covariances decaying as $s^{-\lambda}$ do occur in nature; (b) that the rate of decay may be so small that yield variance increases faster than plot area; (c) that the observed index $\lambda$ may have simple rational values.

None of the simple linear models hitherto proposed to represent yield variation over an area (see Whittle, 1954; Heine, 1955) lead to covariances of the above type. For instance, the model which relates yield density $\xi(x, y)$ symmetrically to the yield at all points around $(x, y)$:

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} - \kappa^2\right)\xi(x, y) = \epsilon(x, y) \tag{42}$$

leads to the covariance function

$$\rho(s) = \text{const. } sK_1(\kappa s) \sim \text{const. } s^{\frac12}e^{-\kappa s} \quad (\kappa s \text{ large}). \tag{43}$$

(Here $K_1$ denotes a modified Bessel function.)

For the one-dimensional problem of yarn diameter variation D. R. Cox (personal communication regarding unpublished work) has proposed a model which does in fact predict a power-law covariance for large $s$. The only multi-dimensional models known to the author which predict a power-law covariance for any range of $s$ are those of turbulence (see Batchelor, 1953, p. 122). For these models dimensional arguments show that the index must have simple rational values.

It seems likely to the author that any model which is to provide a satisfactory explanation of the power-law decay observed in agricultural work must embody two of the features common to Cox's yarn model and the turbulence model: (a) it must be non-linear; (b) it must consider the variate (yield density) to be a function of time as well as of the spatial co-ordinates. (In Cox's case 'time' is discrete, the number of smoothings that the yarn has undergone.) As an example of such a model, one could suppose that fertility gradients in the soil tend to be smoothed out in the course of time by a diffusion process, which is non-linear to the extent that only gradients which are greater than a certain value tend to diminish.

The value of the index $\lambda$ deserves some discussion. Is it fortuitous that Fairfield Smith's estimated index $b = 0.749$ lies so near the rational value $\tfrac34$ (corresponding to $\lambda = \tfrac32$)? Little can be inferred from this single instance. However, Fairfield Smith lists the $b$ values estimated from uniformity data for several different kinds of crop. The histogram of the thirty-nine estimated $b$ values has its main peak in the interval $b = 0.41$-$0.50$, and a second peak and upper cut-off point at $b = 0.71$-$0.80$. This provides at least a suggestion that the




values $b = \tfrac12$ and $\tfrac34$ (corresponding to $\lambda = 1, \tfrac32$) are in some way distinguished. Perhaps one should say no more until further data or predictions from theory are available.

The author is grateful to Dr D. R. Cox for drawing his attention to the observations on yarn diameter and flood height.

REFERENCES

BATCHELOR, G. K. (1953). The Theory of Homogeneous Turbulence. Cambridge University Press.
FAIRFIELD SMITH, H. (1938). J. Agric. Sci. 28, 1-23.
FELLER, W. (1951). The asymptotic distribution of the range of independent random variables. Ann. Math. Statist. 22, 427-32.
HEINE, V. (1955). Models for two-dimensional stationary stochastic processes. Biometrika, 42, 170-8.
WHITTLE, P. (1954). On stationary processes in the plane. Biometrika, 41, 434-49.




ON THE CONSTRUCTION OF SIGNIFICANCE TESTS ON THE 
CIRCLE AND THE SPHERE 


By G. S. WATSON* AND E. J. WILLIAMS†


1. INTRODUCTION 


A number of recent papers have dealt with the probability density, in two and three dimensions, proportional to

$$\exp(\kappa\cos\theta),$$

where $\kappa$ is a precision constant, and $\theta$ is the angle between an observed unit vector and the population mean unit vector or polar vector. The purposes of these investigations have been (i) to derive, from observed results, limits within which the unknown polar vector is likely to lie, and (ii) to test the homogeneity of different sets of observations, both in their ...

For example, in palaeo-magnetism, a sample of rock specimens is collected from a site and the direction of remanent magnetism of each specimen ... geological epochs ...

Fisher (1953) considered the three-dimensional case where the observations may be regarded as points on a sphere. He derived the maximum-likelihood estimates of $\kappa$ and the polar direction and provided the basic distribution theory which he used to find a fiducial test of a prescribed polar direction when $\kappa$ is unknown. Watson (1956a, b) gave a significance table for the test of $\kappa = 0$, and approximate tests for the equality of the $\kappa$'s and polar directions for any number of populations.

Gumbel, Greenwood & Durand (1953) and Greenwood & Durand (1955) studied the two-dimensional case where the observations may be regarded as points on the unit circle. They gave a table to facilitate the calculation of the maximum-likelihood estimate of $\kappa$ and a significance table for testing $\kappa = 0$. For this case, less is known of the distribution theory. Pearson (1906), Kluyver (1906) and Rayleigh (1919) gave results on the distribution of the length of the resultant vector of a sample when $\kappa = 0$. Greenwood & Durand gave some ...

In the present paper, the fundamental property of sufficient statistics will be used to derive tests, free of nuisance parameters, for direction and homogeneity in the two- and three-dimensional cases. In only one situation, however, can we give the required exact distribution. Inequalities are suggested for some of the tests and an arithmetical example suggests that these may be useful in practical applications.


* The Australian National University, Canberra, A.C.T. † C.S.I.R.O., Melbourne.




2. BASIC RESULTS

We write the density function as

$$C_p(\kappa)\exp(\kappa\cos\theta), \tag{1}$$

where $p$ is the number of dimensions and $C_p(\kappa)$ the corresponding constant factor. In general, the range of $\theta$ is $0 \le \theta \le \pi$, but in the two-dimensional case it is conventional (though not necessary) to take $0 \le \theta < 2\pi$.

The constant factor is, in general, the reciprocal of
da: 
2 mem 7 sin»-2 0 e* °° d0, ü 
whicl TE- I)o 
which may be expressed in terms of the imaginary Bessel function $I_{\frac12 p-1}(\kappa)$ or (when $p$ is odd) in terms of $\sinh\kappa$ and $\cosh\kappa$. When $p = 2$ the factor must be halved. Then

$$C_2(\kappa) = 1/\{2\pi I_0(\kappa)\} \tag{3}$$

and

$$C_3(\kappa) = \kappa/(4\pi\sinh\kappa). \tag{4}$$

The probability density of a sample of $N$ is a function only of

$$\sum_{i=1}^{N}\cos\theta_i = X.$$

Hence, when the polar vector is assumed known, so that the $\theta_i$ can be determined from the observed vectors, $X$ is a sufficient statistic for $\kappa$, the maximum-likelihood (ML) equations for $\kappa$ being

$$\frac{I_1(\kappa)}{I_0(\kappa)} = \frac{X}{N} \quad (p = 2) \tag{5}$$

and

$$\coth\kappa - \frac{1}{\kappa} = \frac{X}{N} \quad (p = 3). \tag{6}$$
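Equations (5) and (6) have no closed-form solution for $\kappa$, but their left-hand sides increase steadily with $\kappa$, so a simple bisection suffices. The sketch below is one way of doing this with the standard library only; the power-series Bessel routine, the search interval (0, 100) and the function names are our own assumptions rather than anything prescribed by the paper.

```python
import math

def bessel_i(order, x, terms=200):
    """Modified Bessel function I_order(x) from its power series (x >= 0)."""
    half = 0.5 * x
    term = half ** order / math.factorial(order)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + order))
        total += term
        if term < 1e-17 * total:
            break
    return total

def mean_resultant(kappa, p):
    """Left-hand side of (5) for p = 2 or of (6) for p = 3."""
    if p == 2:
        return bessel_i(1, kappa) / bessel_i(0, kappa)
    if p == 3:
        return 1.0 / math.tanh(kappa) - 1.0 / kappa
    raise ValueError("only p = 2 or p = 3 is handled")

def solve_kappa(x_over_n, p, lo=1e-6, hi=100.0, iters=200):
    """Bisection for kappa given X/N; assumes the root lies in (lo, hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_resultant(mid, p) < x_over_n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, $X/N = 0.9$ on the sphere gives $\kappa$ very close to 10, because $\coth\kappa$ is then practically unity and (6) reduces to $1 - 1/\kappa = 0.9$.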


When the polar vector is not known, its ML estimator is, for all values of $p$, the set of direction cosines of the vector resultant of the sample of $N$ unit vectors; the ML equations for $\kappa$ follow from (5) and (6) (in the cases $p = 2$ or 3) with $X$ replaced by $R$, the length of the resultant. If $c$ is the cosine of the angle between the resultant and polar vectors, then $X = Rc$; $R$ is


C is th i t labem i : 
iene ae Mii the polar vector is known. From the form of the distribution 
is Seen e ^w n sufficient ior 5s pena s x Jonit Ppr ye x apr Mai 
en KAO, is fi c 2 py multiplying the joint probability density, when «= 0, by the factor 
, is found by AGI Maa 
We therefore first consider the uniform distribution given by $\kappa = 0$. When $\kappa = 0$, $R$ and $c$ are independently distributed.

In two dimensions, the density of $R$ may be expressed as an integral (Kluyver, 1906)

$$R\int_0^\infty [J_0(t)]^N J_0(Rt)\,t\,dt, \tag{7}$$

or as an asymptotic series (Pearson, 1906). These representations are discussed by Greenwood & Durand (1955). Hence, in this case, the joint density of $R$ and $c$ is given by

$$\frac{e^{\kappa Rc}}{\pi I_0^N(\kappa)\sqrt{1-c^2}}\,R\left\{\int_0^\infty [J_0(t)]^N J_0(Rt)\,t\,dt\right\}dR\,dc, \tag{8}$$

which is the same as formula (3.6) of Greenwood & Durand.




In three dimensions, Fisher gives the joint density 


eu) ete. qo RaR (0) 
(saan) eRe 6s (R) Rd Rade, 
1 a N E. e | 
m $49 = may 5 (3 (~1)*{N— R —25)v-2, | 


with the notation $\langle x\rangle = x$ if $x \ge 0$, $\langle x\rangle = 0$ if $x < 0$.

It follows from these results, or can be derived directly, that the density of $X = Rc$ in two dimensions is

$$\frac{e^{\kappa X}}{\pi I_0^N(\kappa)}\int_0^\infty [J_0(t)]^N\cos Xt\,dt, \tag{10}$$

and in three dimensions is

$$\left(\frac{\kappa}{2\sinh\kappa}\right)^N\frac{e^{\kappa X}}{(N-1)!}\sum_{s=0}^{N}(-1)^s\binom{N}{s}\langle N - X - 2s\rangle^{N-1}. \tag{11}$$

For confidence limits on $\kappa$, when the polar directions are known, and for tests of a prescribed polar vector, the probability densities (10) and (11) suffice, since in either case the hypothesis specifies $c$ so that $X$ is determinable.
To test the homogeneity of the values of $\kappa$ for several sets of results, or to compare polar vectors, the same value of $\kappa$ being assumed, we need to work with the sample resultants. In such cases the joint distribution of the sample resultants and their overall resultant is required. We suppose that there are samples of sizes $N_1, N_2, \ldots$ with resultants of lengths $R_1, R_2, \ldots$ and that the overall resultant has length $R$.

For resultants in two dimensions, the joint distribution is not easy to derive. We give here the results only for two samples. If the integral in (8) is denoted by $\psi_N(R)$, then the joint density of $R_1$, $c_1$, $R_2$ and $c_2$ from two independent samples is

$$\frac{\exp\{\kappa(R_1c_1 + R_2c_2)\}\,R_1\psi_{N_1}(R_1)\,R_2\psi_{N_2}(R_2)}{I_0^N(\kappa)\,\pi^2\sqrt{(1-c_1^2)(1-c_2^2)}}. \tag{12}$$

We put $R_1c_1 + R_2c_2 = Rc$. Then the joint distribution of $R_1$, $R_2$, $R$ and $c$ may be found by the following simple geometrical argument.

Since $Rc$ is sufficient for $\kappa$, the factor introducing $\kappa$ may be ignored in the derivation and brought back into the final result. Thus $\kappa$ may be assumed zero, so that the angles are uniformly distributed independently of the $R$'s.

$R^2$ ranges from $(R_1 - R_2)^2$ to $(R_1 + R_2)^2$, being given by

$$R^2 = (R_1 - R_2)^2\cos^2\lambda + (R_1 + R_2)^2\sin^2\lambda,$$

where $\lambda$ is half the angle between the two vectors. Also, since $\lambda$ is uniformly distributed from 0 to $\pi$, we may determine the conditional distribution of $R$ given $R_1$, $R_2$. But

$$R\,dR = 2R_1R_2\sin 2\lambda\,d\lambda,$$

while

$$\sin 2\lambda = \{[(R_1 + R_2)^2 - R^2][R^2 - (R_1 - R_2)^2]\}^{\frac12}/(2R_1R_2),$$

so that

$$R\,dR = \{[(R_1 + R_2)^2 - R^2][R^2 - (R_1 - R_2)^2]\}^{\frac12}\,d\lambda.$$
Hence the elementary conditional probability is 

dÀ RdR 


T (Gn Ry - BP] FP = (Ry - Rs 
—G(R,R,R)dR (say) 


The joint probability density of $R_1$, $R_2$, $R$ and $c$ (still with $\kappa = 0$) is therefore

$$\frac{R_1\psi_{N_1}(R_1)\,R_2\psi_{N_2}(R_2)\,G(R_1, R_2, R)}{\pi\sqrt{1-c^2}}. \tag{13}$$

Bringing in the factor introducing $\kappa$, we find the non-null joint density to be ($N = \sum N_i$)

$$\frac{e^{\kappa Rc}\,R_1\psi_{N_1}(R_1)\,R_2\psi_{N_2}(R_2)\,G(R_1, R_2, R)}{\pi I_0^N(\kappa)\sqrt{1-c^2}}. \tag{14}$$

Finally, integrating out $c$, we have for the joint density of $R_1$, $R_2$ and $R$

$$\frac{I_0(\kappa R)}{I_0^N(\kappa)}\,R_1\psi_{N_1}(R_1)\,R_2\psi_{N_2}(R_2)\,G(R_1, R_2, R). \tag{15}$$

The density of $R$ is found, by integrating (8) over all values of $c$, as

$$\frac{I_0(\kappa R)}{I_0^N(\kappa)}\,R\psi_N(R). \tag{16}$$

Hence the conditional density of $R_1$ and $R_2$ given $R$, is

$$\frac{R_1\psi_{N_1}(R_1)\,R_2\psi_{N_2}(R_2)\,G(R_1, R_2, R)}{R\psi_N(R)}. \tag{17}$$

Being independent of $\kappa$, this conditional density provides the basis for exact significance tests. So far it has not been possible to generalize this result to more than two samples.

For resultants in three dimensions, Fisher (1953) has shown that, for any number of samples, the joint-probability density of the $R_i$ and $R$ is
Nasi 
m 2sinb KÈ r5. (R) 
Whi sinh € K Sb Pb 
A (18) 
e density of £t is 
aig sinh kh 4 R 
2 sinh «, K wth), (19) 
S0 i i 
that the conditional density of the R; given R, is 
nds GO/GSCO- 
(20) 


This conditional density is the basis for any exact significance tests.

3. TESTS OF SIGNIFICANCE

The required significance tests for data following circular or spherical distributions have been discussed in §1.

When $\kappa$ is large we may make use of the fact that $2\kappa(1 - \cos\theta)$ is distributed approximately as $\chi^2$ with $p - 1$ degrees of freedom. Hence, we have approximately

$$2\kappa(N - X) = \chi^2_{(p-1)N}, \qquad 2\kappa(N - R) = \chi^2_{(p-1)(N-1)}, \qquad 2\kappa(R - X) = \chi^2_{p-1}. \tag{21}$$

We may therefore take $2\kappa(N - \sum R_i)$ and $2\kappa(\sum R_i - R)$ to be independently distributed as $\chi^2$ with $(p-1)(N-q)$ and $(p-1)(q-1)$ degrees of freedom respectively, $q$ being the number of samples.




The tests based on these results were discussed for the spherical case by Watson (1956a). They will certainly be accurate when $\kappa$ is large, but seem to be accurate also for a wide range of $\kappa$ and $N$. As is usual in problems of this kind, it is hard to assess the degree of approximation of (21). In §4 some more results will be given to justify the tests derived from (21).

To test a given value of $\kappa$ or to derive confidence limits for $\kappa$, we should use

$2\kappa(N - X)$, when the polar vector is known, or

$2\kappa(N - R)$, when the polar vector is unknown.

Likewise, to test the homogeneity of $\kappa$-values for different samples, we should compare corresponding values of $N - X$ or $N - R$.

To test the equality of polar vectors for different samples, we should use, generalizing Watson (1956a),

$$\frac{(N - q)(\sum R_i - R)}{(q - 1)(N - \sum R_i)}, \tag{22}$$

which is distributed as $F$ with $(p-1)(q-1)$ and $(p-1)(N-q)$ degrees of freedom.

The distributions of all the test statistics given above carry $\kappa$ as a nuisance parameter. The tests given below do not, but the required distributions, with one exception, are not known. However, some inequalities are given which may help in their application.
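The statistic (22) needs only the sample resultants and the overall resultant. The sketch below computes it for observations on the circle ($p = 2$) given as plane angles in degrees; the data are hypothetical and the function names are ours. The value would then be referred to $F$ tables with the stated degrees of freedom, a step not carried out here.

```python
import math

def resultant_length(angles_deg):
    """Length of the vector sum of unit vectors at the given plane angles."""
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.hypot(c, s)

def polar_vector_F(samples_deg):
    """Statistic (22) with its degrees of freedom for q samples on the circle (p = 2)."""
    q = len(samples_deg)
    n_total = sum(len(sample) for sample in samples_deg)
    sum_r = sum(resultant_length(sample) for sample in samples_deg)
    r_all = resultant_length([a for sample in samples_deg for a in sample])
    f_value = (n_total - q) * (sum_r - r_all) / ((q - 1) * (n_total - sum_r))
    return f_value, q - 1, n_total - q

# Hypothetical data: two tight samples whose mean directions differ by 90 degrees.
f_value, df1, df2 = polar_vector_F([[-10.0, 10.0], [80.0, 100.0]])
```

The statistic is large here because $\sum R_i$ is close to $N$ (each sample is tightly grouped) while the overall resultant $R$ is much shorter than $\sum R_i$.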


(a) Tests of hypotheses concerning $\kappa$'s

It has been seen in §2 that when the polar vector is known, $X = \sum\cos\theta$ is a sufficient statistic for $\kappa$. To test that several populations with known polar vectors have a common value of $\kappa$, suppose samples of $N_1, N_2, \ldots$ with values $X_1, X_2, \ldots$ are available. Then the joint density of $X_1, X_2, \ldots$, given the value $X$ of $\sum X_i$, may be written as

$$\frac{A_{N_1}(X_1)\,A_{N_2}(X_2)\ldots}{A_N(X)}, \tag{23}$$

where $A_N(X)$ stands for the factor of (10) or (11) independent of $\kappa$. The distribution of any test function found from the density (23) will be independent of $\kappa$. However, as the case of a known polar vector is of no practical interest, it will not be pursued further.

When the polar vectors are unknown, the test of the equality of $\kappa$ in several populations must be based on the resultants $R_1, R_2, \ldots$ of the several samples, and the overall resultant $R$. Formulae (17) or (20) show that the joint density of $R_1, R_2, \ldots$ given $R$ is independent of $\kappa$, so that an exact test can be made. When $\kappa$ is large, or when the $N_i$ are all large, the test function must be sensitive to deviations of the ratios $R_i/N_i$ from equality. Under these conditions, the tests based on (21) are likely to be adequate. Nothing is known of suitable tests when both $\kappa$ and the $N_i$ are small.


(b) Tests of hypotheses concerning polar vectors

(i) A single prescribed polar vector. If $\kappa$ is known, a test of a prescribed polar vector may be made most conveniently by using the density of $X$ ((10) or (11)). Usually $\kappa$ is not known. To derive an analogue to the single-sample Student $t$-test, we consider the density of $R$ given $X$. Since the density is an even function of $X$, these are strictly tests of a prescribed axis. In practice, however, the observations will usually be so grouped that no confusion will arise.




In two dimensions, this is found from (7) and (10) to be

$$\frac{R\int_0^\infty [J_0(t)]^N J_0(Rt)\,t\,dt}{\sqrt{R^2 - X^2}\int_0^\infty [J_0(t)]^N\cos Xt\,dt}, \tag{24}$$

where $R \ge X$. If the sample mean direction is far from the prescribed direction, $R$ will be much greater than $X$. Thus to define a possible significance test, we find a value of $R$, $R_0$ say, so that $P(R > R_0 \mid X) = \alpha$, i.e. so that

$$\int_{R_0}^{N}\frac{R}{\sqrt{R^2 - X^2}}\int_0^\infty [J_0(t)]^N J_0(Rt)\,t\,dt\,dR = \alpha\int_0^\infty [J_0(t)]^N\cos Xt\,dt. \tag{25}$$

This is difficult to solve for $R_0$. Since, however, the integral of (24) from $X$ to $N$ is unity, we have the relation

$$\int_0^\infty [J_0(t)]^N\cos R_0t\,dt = \int_{R_0}^{N}\frac{R}{\sqrt{R^2 - R_0^2}}\int_0^\infty [J_0(t)]^N J_0(Rt)\,t\,dt\,dR > \int_{R_0}^{N}\frac{R}{\sqrt{R^2 - X^2}}\int_0^\infty [J_0(t)]^N J_0(Rt)\,t\,dt\,dR,$$

since $R > R_0 > X$. Thus, in two dimensions,

$$P(R > R_0 \mid X) < \frac{\int_0^\infty [J_0(t)]^N\cos R_0t\,dt}{\int_0^\infty [J_0(t)]^N\cos Xt\,dt}. \tag{26}$$

The functions on the right-hand side of (26) could, with some difficulty, be evaluated numerically for various $R_0$, $X$ and $N$ or, more approximately, be replaced by their saddle-point approximations or by their approximate forms when $R$ and $X$ are large, i.e. near $N$.

In three dimensions, we have instead of (24)

$$\frac{(N-1)\sum_s (-1)^s \binom{N}{s} (N-R-2s)^{N-2}}{\sum_s (-1)^s \binom{N}{s} (N-X-2s)^{N-1}} \qquad (R \ge X), \tag{27}$$

so that, by an easy calculation,

$$P(R > R_0 \mid X) = \frac{\sum_s (-1)^s \binom{N}{s} (N-R_0-2s)^{N-1}}{\sum_s (-1)^s \binom{N}{s} (N-X-2s)^{N-1}}, \tag{28}$$

each sum extending over those integers $s \ge 0$ for which the bracketed quantity is positive.

It will be seen that the right-hand sides of (26) and (28) have the same form, i.e. the ratio of the density of $X$ evaluated at $R_0$ and $X$. Furthermore, the ratio of the leading terms in (28),

$$\left(\frac{N-R_0}{N-X}\right)^{N-1},$$

is Fisher's first approximation to the probability of a cosine less than $c$, calculated fiducially. In computing (28), this factor is best taken separately in order to reduce the number of working figures.




(ii) Comparison of several polar vectors. It will often be required to test the equality of the polar vectors of several populations. To devise a possible test statistic, we observe that, if the sample resultants have very similar orientations, $R_1 + R_2 + \dots$ will not be much greater than $R$, while if the orientations are not alike $R_1 + R_2 + \dots$ will be much greater than $R$. Since the density of $R_1, R_2, \dots$ given $R$ is free of $\kappa$, it provides a possible exact test.

In the case of two populations in two dimensions, the joint density of $R_1$ and $R_2$ given $R$ is given by (17), where the region of joint variation of $R_1$, $R_2$ and $R$ is defined by

$$0 \le R_1 \le N_1, \quad 0 \le R_2 \le N_2, \quad 0 \le R \le N, \quad |R_1 - R_2| \le R \le R_1 + R_2. \tag{29}$$

We require a value $R_0$ such that

$$P(R_1 + R_2 > R_0 \mid R) = \alpha. \tag{30}$$

As with equation (25), it seems difficult to solve for $R_0$, but an inequality is easily obtained by the method used to derive (26). Thus

$$P(R_1 + R_2 > R_0 \mid R) < \frac{\psi_N(R_0)}{\psi_N(R)}. \tag{31}$$

For application of (31), the function $\psi_N$ must be found. Use may be made of the fact that $\psi_N(R)$ is asymptotically $2\exp(-R^2/N)/N$.
An alternative test is given by the function 


2) 
2) i+ f, — P] — (R, — py d 
-1)(N-— 2 

N(N —1)(N —2) (QN? RE) [N*R? Qe — gy] B. 
which is distributed approximately as $\chi^2$ with 1 degree of freedom. This tests equality or oppositeness of direction; that is, departures of $R$ either from $R_1 + R_2$ or from $|R_1 - R_2|$.

For several samples in three dimensions, the joint density of $R_1, R_2, \dots$ given $R$ is given by (20). Once again, only an inequality is easily found. This is that

$$P\Big(\sum R_i > R_0 \,\Big|\, R\Big) < \frac{\phi_N(R_0)}{\phi_N(R)}. \tag{33}$$

Here the function $\phi_N(R)$ is easy to compute in any application. Actually there are cases
where (33) is an equality. For example, with two samples only, the domain of ae i; 
given Ris (29). A little consideration shows that there will be equality in (33) ifR2™ 
or if Ry+ R> max (2N;). 


4. NUMERICAL APPLICATION AND COMPARISON OF SIGNIFICANCE TESTS 


In this section, we apply some of the tests suggested here to the numerical examples given by Fisher (1953) and Watson (1956a) and compare the results with those obtained in these papers. The agreement is remarkable, and, since Watson's tests are the easiest to apply, it is suggested that they may be used with confidence despite their approximate derivation.

To begin with we consider Fisher's example (a). It is required to find the 5% zone of confidence for the polar direction. A sample of $N = 9$ is available and $R = 8.77203$. Fisher uses his approximate form for the probability that the cosine of the angle between the resultant and the polar vector is less than $c$,

$$P = \left(\frac{N-R}{N-Rc}\right)^{N-1}. \tag{34}$$




With $P = 0.05$ he finds $c = 0.98820$, so that the confidence zone includes only directions less than 8.81° away from the resultant direction. We have shown that Fisher's solution is in fact a first approximation to our exact expression (28). Here, in fact, they coincide, since $R$ is close to $N$. So Fisher's solution is also that obtained from (28). Incidentally, the estimate of $\kappa$ from these data is …. Watson's (1956a) approximate result in this case is that

$$(N-1)\,\frac{R(1-c)}{N-R} = F_{2,\,2(N-1)}, \tag{35}$$

which is equivalent to Fisher's result (34).
In Fisher's example (b), a sample of 45 more widely dispersed ($\hat\kappa = 7.3$) directions is considered. Since $R = 38.9946$, Fisher finds that the 5% zone of confidence is limited by the value of $c$ corresponding to an angle of 8.45°, by his formula given above. Taking $F_{2,88}(5\%) = 3.10$, formula (35) gives an identical value of $c$. The extra terms in (28) and also the extra terms in Fisher's more complicated formula for $P$ make no difference to the value of $c$. Thus the approximation (34) is entirely satisfactory for small values of $\kappa$ and medium values of $N$.

Needless to say, had the various formulae been used for a test of significance of a given polar direction on the above data, similar agreements would have been found.
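The agreement between the two routes to the confidence limit can be checked numerically. The sketch below is only an illustration: it assumes that Fisher's approximation (34) has the form $P = \{(N-R)/(N-Rc)\}^{N-1}$ and that the F-approximation (35) has the form $(N-1)R(1-c)/(N-R) = F_{2,2(N-1)}$, and it uses the closed-form upper tail of an F variate with $(2, 2m)$ degrees of freedom, $P(F > f) = (1 + f/m)^{-m}$. The function names are hypothetical.

```python
import math


def fisher_c(N, R, P):
    """Solve P = ((N - R) / (N - R*c))**(N - 1) for c (assumed form of (34))."""
    return (N - (N - R) * P ** (-1.0 / (N - 1))) / R


def f_upper_point_2_2m(m, alpha):
    """Upper alpha point of F with (2, 2m) d.f., from P(F > f) = (1 + f/m)**(-m)."""
    return m * (alpha ** (-1.0 / m) - 1.0)


def watson_c(N, R, alpha):
    """Solve (N - 1) * R * (1 - c) / (N - R) = F for c (assumed form of (35))."""
    F = f_upper_point_2_2m(N - 1, alpha)
    return 1.0 - F * (N - R) / ((N - 1) * R)


def semi_angle_degrees(c):
    """Angle whose cosine is c, in degrees."""
    return math.degrees(math.acos(c))


if __name__ == "__main__":
    # Fisher's examples (a) and (b) as quoted in the text.
    for N, R in [(9, 8.77203), (45, 38.9946)]:
        c_fisher = fisher_c(N, R, 0.05)
        c_watson = watson_c(N, R, 0.05)
        print(N, round(c_fisher, 5), round(c_watson, 5),
              round(semi_angle_degrees(c_fisher), 2))
```

Under these assumed forms the two limits agree exactly, because $(1 + f/m)^{-m}$ with $m = N - 1$ and $f = (N-1)R(1-c)/(N-R)$ is algebraically the right-hand side of (34).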

Watson (1956a) gives an example of a test that the polar directions of three populations are identical. In the data analysed

          Sample 1   Sample 2   Sample 3
  N_i        10         11         15
  R_i      6.990      8.212     12.194

ap ile for the combined sample N = 36; R=26-902. In this situation, Watson proposed the 
Proximation (N-3)ER,— R 
T Daaa og NER) (36) 
The right-hand side is found to be 1.81. Graphical interpolation in the F-tables shows that this corresponds to a probability of about 13%. The inequality (33) of this paper may be applied to this problem by taking

$$R_0 = \sum R_i = 27.396, \qquad R = 26.902.$$
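Then

$$\frac{\phi_N(R_0)}{\phi_N(R)} = \left(\frac{N-R_0}{N-R}\right)^{N-2} \frac{\sum_{s=0} (-1)^s \binom{N}{s} \left(1 - \dfrac{2s}{N-R_0}\right)^{N-2}}{\sum_{s=0} (-1)^s \binom{N}{s} \left(1 - \dfrac{2s}{N-R}\right)^{N-2}}$$

$$= 0.1499 \times 1.0034$$

$$= 0.1504.$$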

Since 0.15 is only slightly greater than 0.13, it seems likely that very little is lost in forming the inequality (33) (i.e. it is close to equality) in practice and that the F-approximation is accurate.

Furthermore, we notice in this case that the value of $\phi_N(R_0)/\phi_N(R)$ is accurately given by the leading term, so that one might suggest the simple approximation

$$P\Big(\sum R_i > R_0 \,\Big|\, R\Big) \approx \left(\frac{N-R_0}{N-R}\right)^{N-2}, \tag{37}$$

unless the values of $N$ and $\hat\kappa$ in each sample are both small.
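The arithmetic behind the quoted figures 0.1499, 1.0034 and 0.1504 can be reproduced with a short sketch. It rests on an assumption that the damaged formula does not settle: that $\phi_N(R)$ is proportional to $\sum_s (-1)^s \binom{N}{s}(N - R - 2s)^{N-2}$, summed over the terms with $N - R - 2s > 0$. The function name is hypothetical.

```python
from math import comb


def phi_ratio_parts(N, R0, R):
    """Return (leading term, correction factor) of phi_N(R0) / phi_N(R).

    Assumption: phi_N(R) is proportional to
    sum_s (-1)**s * C(N, s) * (N - R - 2s)**(N - 2),
    taken over the terms with N - R - 2s > 0.
    """
    def scaled_sum(r):
        # The sum divided by its own leading term (N - r)**(N - 2).
        total, s = 0.0, 0
        while N - r - 2 * s > 0:
            total += (-1) ** s * comb(N, s) * (1.0 - 2.0 * s / (N - r)) ** (N - 2)
            s += 1
        return total

    leading = ((N - R0) / (N - R)) ** (N - 2)
    correction = scaled_sum(R0) / scaled_sum(R)
    return leading, correction


if __name__ == "__main__":
    # Combined-sample values quoted in the text.
    leading, correction = phi_ratio_parts(36, 27.396, 26.902)
    print(round(leading, 4), round(correction, 4), round(leading * correction, 4))
```

With these inputs the correction factor differs from unity by less than one half of one per cent, which is the sense in which the leading term alone is an adequate approximation here.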




Assuming that the approximations (21) are adequate for testing purposes, the relation (37) 
has the circular analogues 


> ivi 
P(R> R| X) (5) en 
F 4-1 
add P(ER;» R| R)x e =) : e9 


The merits of these approximations are unknown. 


REFERENCES 


Fisher, R. A. (1953). Dispersion on a sphere. Proc. Roy. Soc. A, 217, 295.
Greenwood, J. A. & Durand, D. (1955). The distribution of length and components of the sum of n random unit vectors. Ann. Math. Statist. 26, 233.
Gumbel, E. J., Greenwood, J. A. & Durand, D. (1953). The circular normal distribution: theory and tables. J. Amer. Statist. Ass. 48, 131.
Irving, E. & Watson, G. S. (1956). The use of statistics in studies of the magnetic directions in rocks (unpublished).
Kluyver, J. C. (1906). A local probability problem. Proc. Acad. Sci. Amst. 8, 341.
Pearson, K. (1906). A mathematical theory of random migration. Drap. Co. Res. Mem. 3.
Strutt, J. W. (Lord Rayleigh) (1919). On the problem of random vibrations and random flights in one, two and three dimensions. Phil. Mag. (6), 37, 321.
Watson, G. S. (1956a). Analysis of dispersion on a sphere. Mon. Not. R. Astr. Soc. Geophys. Suppl. 7, 153.
Watson, G. S. (1956b). A test for randomness of directions. Mon. Not. R. Astr. Soc. Geophys. Suppl. 7, 160.




NOTES ON BIAS IN ESTIMATION 


By M. H. QUENOUILLE


Research Techniques Unit, London School of Economics and Political Science


1. One of the commonest problems in statistics is, given a series of observations $x_1, x_2, \dots, x_n$, to find a function of these, $t_n(x_1, x_2, \dots, x_n)$, which should provide an estimate of an unknown parameter $\theta$.

The desirable properties of estimation procedures have been discussed fully elsewhere. They are:

(a) That the estimator should be efficient according to some definition of efficiency previously arranged. Most commonly, the reciprocal of the variance of the estimates is taken as a measure of its efficiency, as this is most useful where central limit theory may be relevant.

(b) That the estimator should utilize all the information contained in the observations $x_1, \dots, x_n$ concerning the parameter $\theta$. This is not always possible, but, if such an estimator exists, it is called sufficient.

(c) That the estimator should be consistent, i.e. $t_n$ converges in some probabilistic sense to $\theta$, usually $\lim t_n = \theta$.

(d) That the estimator should be unbiased, i.e. $E(t_n) = \theta$.

The method of maximum likelihood is popular in that it satisfies properties (a) to (c), when this is possible, and, by evaluating $E(t_n)$, an unbiased statistic may be derived. That such evaluation is necessary is obvious when it is remembered that $\psi(t_n)$ is, by the same theory, the estimator of $\psi(\theta)$, so that it will be the exception rather than the rule for the estimator to be unbiased.

Provided the expectations may be simply evaluated no real difficulty arises. However, the complexity of the evaluation often presents a major drawback and some simple approach is then desirable.


2. If the observations are taken in random order, the estimator $t_n$ may often be written

$$t_n = t_n(k_1, k_2, \dots, k_m),$$

where $k_1, k_2, \dots, k_m$ are unbiased estimates of the cumulants $\kappa_1, \kappa_2, \dots, \kappa_m$. Then, provided that

(a) $m$ is independent of $n$,
(b) the function $t_n$ is capable of Taylorian expansion,
(c) all of the cumulants are finite,
(d) $t_n$ is consistent, i.e. $\theta = \lim t_n(\kappa_1, \dots, \kappa_m)$,          (I)

it follows that

$$t_n - \theta = \sum_i (k_i - \kappa_i)\frac{\partial t_n}{\partial \kappa_i} + \tfrac12 \sum_i \sum_j (k_i - \kappa_i)(k_j - \kappa_j)\frac{\partial^2 t_n}{\partial \kappa_i\,\partial \kappa_j} + \dots,$$

and, since the moments of the estimators, $k_i$, are power series in $1/n$, it follows that $E(t_n - \theta)$, i.e. the bias in $t_n$, is also expressible as a power series in $1/n$.


The conditions (I) are undoubtedly more stringent than they need be. For instance, the higher cumulants need not exist. Further, it appears likely that I (b) is a necessary condition if I (d) is to hold. However, the main point is that for a wide variety of statistics it is true that

$$E(t_n - \theta) = \frac{a_1}{n} + \frac{a_2}{n^2} + \frac{a_3}{n^3} + \dots$$


3. If this is so, and $t'_n = n t_n - (n-1) t_{n-1}$, then

$$E(t'_n) - \theta = -\frac{a_2}{n^2} - \dots,$$

and hence $t'_n$ is biased to order $1/n^2$ only. Similarly, $t''_n = [n^2 t'_n - (n-1)^2 t'_{n-1}]/[n^2 - (n-1)^2]$ is biased to order $1/n^3$, and so on.

Alternatively, it is possible to use the statistics calculated from any subset of observations to achieve corrections for bias. A further approach is of particular use when $n = 2p$. Here, we may use $t'_{2p} = 2t_{2p} - t_p$ as being free from bias to order $1/n$.

Procedures such as these may supply approximate corrections for bias provided that efficiency of estimation is not lost in the process. To achieve this in general, it is necessary to use $\bar t_{n-1}$, the average of estimates from all possible sets of $n-1$ observations, instead of $t_{n-1}$, and similarly $\bar t_{n-2}$ instead of $t_{n-2}$, etc. With this provision, it appears likely that little, if any, loss of efficiency will result.
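The first-order correction of this section can be sketched in a few lines. The code below is a minimal illustration, not the author's own computing scheme: it forms $t'_n = n t_n - (n-1)\bar t_{n-1}$, with $\bar t_{n-1}$ the average of the $n$ estimates obtained by leaving out one observation at a time. The function names and the sample data are hypothetical.

```python
from statistics import fmean


def bias_corrected(estimator, xs):
    """Return t'_n = n*t_n - (n - 1)*tbar_{n-1} for a list of observations."""
    n = len(xs)
    t_n = estimator(xs)
    # Estimates from every subset of n - 1 observations.
    leave_one_out = [estimator(xs[:i] + xs[i + 1:]) for i in range(n)]
    return n * t_n - (n - 1) * fmean(leave_one_out)


def variance_divisor_n(xs):
    """Sum of squared deviations divided by n (biased to order 1/n)."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 5.0, 7.0]  # arbitrary illustrative values
    print(variance_divisor_n(data), bias_corrected(variance_divisor_n, data))
```

For the divisor-$n$ variance estimate the corrected value coincides with the usual estimate with divisor $n-1$, so the bias is removed completely in that case rather than merely reduced to order $1/n^2$.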


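4. For instance, many of the statistics $t_n$ may be derived from an estimation procedure of form

$$\sum_{i=1}^{n} G(x_i, t_n) = 0.$$

The variance of $t_n$ to the first order may be estimated from this equation by a technique such as has been described by Weatherburn (1952, pp. 130 et seq.) and Kendall (1949, pp. 208 et seq.). In the simplest instance it is possible to represent the argument as follows. If $\mu = E(x_i)$, then

$$n\,\delta t_n = H(\theta) \sum_{i=1}^{n} \delta x_i,$$

where

$$H(\theta) = -E\left[\frac{\partial G}{\partial x}\right]_{\mu,\theta} \Big/ E\left[\frac{\partial G}{\partial t}\right]_{\mu,\theta},$$

if both expectations exist. (If they do not exist, then generally the basic equation is changed to one of the form

$$n\,\delta t_n = \sum_{i=1}^{n} H(x_i, \theta),$$

to which a similar argument may be applied.)

Thus

$$\operatorname{var} t_n = \frac{[H(\theta)]^2}{n} \operatorname{var} x,$$

$$(n-1)\,\delta t_{n-1} = H(\theta) \sum_{i \ne j} \delta x_i,$$

$$(n-1)\,\delta \bar t_{n-1} = H(\theta)\,\frac{1}{n} \sum_{j=1}^{n} \sum_{i \ne j} \delta x_i = \frac{n-1}{n} H(\theta) \sum_{i=1}^{n} \delta x_i.$$

Hence

$$\delta t'_n = n\,\delta t_n - (n-1)\,\delta \bar t_{n-1} = \frac{H(\theta)}{n} \sum_{i=1}^{n} \delta x_i$$

and

$$\operatorname{var} t'_n = \frac{[H(\theta)]^2}{n} \operatorname{var} x = \operatorname{var} t_n,$$

to order $1/n$.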


This implies that this correction affects the standard error by a factor of $1/n$ at most, i.e.

S.E. of $t'_n$ = (S.E. of $t_n$) $\{1 + O(1/n)\}$.

Since, in general, the standard error of $t_n$ will decrease with $n$ (usually proportional to $n^{-1/2}$), the correction will affect the dispersion of the distribution by $o(1/n)$ (usually, $O(n^{-3/2})$, i.e. by a small amount in comparison with the bias). The reduction in bias achieved by using $t'_n$ is consequently not obtained at the expense of a comparable increase in the dispersion of the distribution of the estimator.


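5. As a first illustration, suppose $\theta = \sigma^2$ for a normal distribution, and $t_n = \sum_{i=1}^{n} (x_i - \bar x)^2 / n$. Then

$$t'_n = n t_n - (n-1)\bar t_{n-1} = \frac{n}{n^2} \sum_{i<j} (x_i - x_j)^2 - \frac{(n-1)(n-2)}{n(n-1)^2} \sum_{i<j} (x_i - x_j)^2$$

$$= \frac{1}{n(n-1)} \sum_{i<j} (x_i - x_j)^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar x)^2$$

and

$$\operatorname{var} t'_n = \left(\frac{n}{n-1}\right)^2 \operatorname{var} t_n.$$

Similarly, if $t'_{2p} = 2t_{2p} - \bar t_p$, then

$$t'_{2p} = \frac{1}{2p-1} \sum_{i=1}^{2p} (x_i - \bar x)^2$$

and

$$\operatorname{var} t'_{2p} = \left(\frac{2p}{2p-1}\right)^2 \operatorname{var} t_{2p} = \frac{2\sigma^4}{2p-1}.$$

If, alternatively, $\bar t_p$ is calculated from only one pair of possible sets of $p$ observations, say $x_1$ to $x_p$ and $x_{p+1}$ to $x_{2p}$, then

$$t'_{2p} = \frac{1}{2p} \sum_{i=1}^{2p} x_i^2 - \frac{1}{p^2} \left(\sum_{i=1}^{p} x_i\right) \left(\sum_{i=p+1}^{2p} x_i\right)$$

and

$$\operatorname{var} t'_{2p} = \frac{p+1}{p^2}\,\sigma^4.$$

Thus averaging over only one pair of the possible sets results in a decrease in the efficiency of estimation, the variance being increased by the factor

$$\frac{(2p-1)(p+1)}{2p^2} = 1 + \frac{p-1}{2p^2}.$$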
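The two half-sample corrections discussed in this section can be compared directly on a small sample. The sketch below assumes the biased estimate $t_n$ is the sum of squared deviations divided by $n$, and that the single-split version has the closed form $\sum x_i^2/(2p) - (\sum_{i \le p} x_i)(\sum_{i > p} x_i)/p^2$. Function names and data are hypothetical, and enumerating every subset of size $p$ is only practical for small $p$.

```python
from itertools import combinations
from statistics import fmean


def var_n(xs):
    """Variance estimate with divisor n (the biased t_n)."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def half_sample_all_subsets(xs):
    """2*t_2p minus the mean of t_p over every subset of p observations."""
    p = len(xs) // 2
    t_bar_p = fmean(var_n(list(sub)) for sub in combinations(xs, p))
    return 2 * var_n(xs) - t_bar_p


def half_sample_one_split(xs):
    """2*t_2p minus the mean of t_p over the first and second halves only."""
    p = len(xs) // 2
    return 2 * var_n(xs) - 0.5 * (var_n(xs[:p]) + var_n(xs[p:]))


def one_split_closed_form(xs):
    """sum(x^2)/(2p) - (sum of first half)*(sum of second half)/p^2."""
    p = len(xs) // 2
    return sum(x * x for x in xs) / (2 * p) - sum(xs[:p]) * sum(xs[p:]) / p ** 2


if __name__ == "__main__":
    data = [1.0, 3.0, 4.0, 8.0, 9.0, 11.0]  # arbitrary values, 2p = 6
    print(half_sample_all_subsets(data))
    print(half_sample_one_split(data), one_split_closed_form(data))
```

Averaging over all subsets reproduces the divisor-$(2p-1)$ estimate exactly, whereas the single split depends on which observations fall in which half; this dependence is the source of the extra variance.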




6. As a numerical illustration, suppose that it is desired to estimate $\theta = 1/\mu$ from a series of observations taken from a normal distribution. Then

$$t_n = n \Big/ \sum_{i=1}^{n} x_i$$

and

$$t'_n = \frac{n^2}{\sum_{i=1}^{n} x_i} - \frac{(n-1)^2}{n} \sum_{i=1}^{n} \frac{1}{\sum_{j \ne i} x_j}, \qquad \operatorname{var} t'_n \approx \operatorname{var} t_n \approx \frac{\sigma^2}{n\mu^4}.$$

The values in column 1 of Table 1 were random observations from a normal distribution with $\mu = 2$, $\sigma^2 = 1$ (using the first ten random numbers in Fisher & Yates's (1953) Statistical Tables).

Table 1

    x_i      18.32 - x_i      t_{n-1,i} = 9/(18.32 - x_i)
    0.18        18.14              0.4961
    4.00        14.32              0.6285
    1.04        17.28              0.5208
    0.85        17.47              0.5152
    2.14        16.18              0.5562
    1.01        17.31              0.5199
    3.01        15.31              0.5879
    2.33        15.99              0.5629
    1.57        16.75              0.5373
    2.19        16.13              0.5580
   -----      -------             ------
   18.32       164.88 = 9 x 18.32  5.4828

Then $t_n = 1/1.832 = 0.54585$,

$$t'_n = 5.4585 - 9 \times 0.54828 = 0.5240.$$

Here, owing to the high value of $s^2$ (= 1.4), this latter value has been corrected more than it would be using the exact formula

$$E(t_n) = \frac{\sqrt n}{\sigma} \exp\left(-\frac{n\mu^2}{2\sigma^2}\right) \int_0^{\mu\sqrt n/\sigma} \exp\left(\tfrac12 u^2\right) du \quad \text{(see appendix)}$$

$$= \frac{1}{\mu} + \frac{\sigma^2}{n\mu^3} + \dots \quad \text{for large } \mu.$$
This might be compared with 
ye 
n—1 n n(r;—z) n?(x;— T)? " Xe 
t = t T = 
9p" Ry, 7| * wept esa]. where f 


; es n?s? 

aim n—1) o» 
z n mns 1] 8 
a 




In this instance it might be noted that the procedure will probably break down if $n^{1/2}\mu/\sigma$ is small. This should be apparent from the behaviour of the $t_{n-1,i}$, which will vary in sign.
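The figures of Table 1 and the corrected estimate can be recomputed from the ten observations. The sketch below assumes only that $t_n = n/\sum x_i$ and that each leave-one-out estimate is $(n-1)/(\sum x_j - x_i)$; the function name is hypothetical.

```python
# The ten observations listed in column 1 of Table 1.
XS = [0.18, 4.00, 1.04, 0.85, 2.14, 1.01, 3.01, 2.33, 1.57, 2.19]


def reciprocal_mean_estimates(values):
    """Return (leave-one-out estimates, t_n, corrected t'_n) for theta = 1/mu."""
    n, total = len(values), sum(values)
    # Estimate of 1/mu with the i-th observation removed.
    leave_one_out = [(n - 1) / (total - x) for x in values]
    t_n = n / total
    t_bar = sum(leave_one_out) / n
    t_corrected = n * t_n - (n - 1) * t_bar
    return leave_one_out, t_n, t_corrected


if __name__ == "__main__":
    loo, t_n, t_corrected = reciprocal_mean_estimates(XS)
    for x, t in zip(XS, loo):
        print(f"{x:5.2f}  {sum(XS) - x:6.2f}  {t:.4f}")
    print(round(sum(loo), 4), round(t_n, 5), round(t_corrected, 4))
```

If any leave-one-out total came close to zero the corresponding estimate would be very large or change sign, which is the breakdown mentioned at the end of the section.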


7. Consider next an inverse sampling procedure. Suppose the proportion, $\pi$, of individuals with a given characteristic is to be estimated, and sampling continues until $r$ individuals with the characteristic are observed. Let $n$ be the total number of individuals. If $t_n = r/n$ then, since the last individual is constrained to have the characteristic, there are $n-1$ values of $t_{n-1}$ to be considered, and

$$\bar t_{n-1} = \frac{1}{n-1} \left[(r-1)\,\frac{r-1}{n-1} + (n-r)\,\frac{r}{n-1}\right] = \frac{nr - 2r + 1}{(n-1)^2}.$$

Thus

$$t'_n = r - \frac{nr - 2r + 1}{n-1} = \frac{r-1}{n-1},$$

which actually is strictly unbiased. Alternatively, if $1/\pi$ is to be estimated, then $t_n = n/r$, and

$$\bar t_{n-1} = \frac{1}{n-1} \left[(r-1)\,\frac{n-1}{r-1} + (n-r)\,\frac{n-1}{r}\right] = \frac{n}{r}.$$

Thus $t'_n = n \cdot n/r - (n-1) \cdot n/r = n/r$, which again is strictly unbiased.

This indicates that the procedure may be useful in sequential estimation or inverse sampling.
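The inverse-sampling result can be checked both algebraically and in expectation. The sketch below assumes the sample is recorded as a 0/1 list ending at the $r$th success, applies the correction over the $n-1$ admissible deletions, and then evaluates the expectation under the negative binomial distribution of $n$ by a truncated sum. Function names, the example sequence and the truncation point `n_max` are hypothetical choices.

```python
from fractions import Fraction
from math import comb


def corrected_inverse_sampling(outcomes):
    """t'_n = n*t_n - (n - 1)*tbar_{n-1} for t_n = r/n under inverse sampling.

    `outcomes` is a 0/1 list that ends with the r-th success, so only the
    first n - 1 observations may be deleted.
    """
    n, r = len(outcomes), sum(outcomes)
    t_n = Fraction(r, n)
    loo = [Fraction(r - outcomes[i], n - 1) for i in range(n - 1)]
    t_bar = sum(loo, Fraction(0)) / (n - 1)
    return n * t_n - (n - 1) * t_bar


def expectation_under_inverse_sampling(statistic, r, pi, n_max=2000):
    """E[statistic(r, n)] when n is the trial on which the r-th success occurs."""
    return sum(statistic(r, n) * comb(n - 1, r - 1) * pi ** r * (1 - pi) ** (n - r)
               for n in range(r, n_max + 1))


if __name__ == "__main__":
    sample = [0, 1, 0, 0, 1, 1, 0, 1]  # n = 8, r = 4, ends with a success
    print(corrected_inverse_sampling(sample))  # equals (r - 1)/(n - 1)
    print(expectation_under_inverse_sampling(lambda r, n: r / n, 5, 0.3))
    print(expectation_under_inverse_sampling(lambda r, n: (r - 1) / (n - 1), 5, 0.3))
```

The uncorrected ratio $r/n$ overestimates $\pi$ on average, while the corrected form $(r-1)/(n-1)$ has expectation $\pi$ up to the negligible truncation error of the sum.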


8. It is possible to use simple extensions of this procedure to correct the bias in any combination of statistics, $f(t_m, t_n)$. The statistic

$$mn\,f(t_m, t_n) - (m-1)n\,f(t_{m-1}, t_n) - m(n-1)\,f(t_m, t_{n-1}) + (m-1)(n-1)\,f(t_{m-1}, t_{n-1})$$

is, for example, unbiased to order $1/n^2$ or $1/m^2$, whichever is the greater.
sees Yns are used in 


1g, fo: 


the inn t, both the bias and the efficiency will depend upon these observations, and 
© à simple correction of the above type will not be possible. 


+ Ani 
™Mportant exception occurs when 
E (II) 
zal NE 
La thie s) 
S ‘stance, if 
r 2 = 2 (a7. 
ang E= BEEP 055) = OP rarely Hye) 
m-l_, 1 
hen, ipy; PM eal 
n= nt, — (n — 1) t, 4, where 
= bri [. 1 | 
i ia = È zy 06,3) 


^ lg 2 
* Unbj " 2 - 
à the „sed to order 1/n?, and var t, = Var t, to order 1/n*. This formula may thus be put 


ernati 
Dative forms ; 5 OY) , 
=, o(t,,-1) et 
PY n) 
= nl, -x(1 7 EU) bya 




In a similar manner, if 


ve yp (elev ee jl 


re 


n—? = 
n—2 T= Ls 


S U) a OM) 
and tn = (n—1)tn— (n— 2) bni: 
t 1 
where i1 =| 5 | / [= - | 
= | o*(L, .,) 0*(t, 4) 
is unbiased to order 1/n?. dia 
It is also possible to split 2p observations into groups of p. Thus, if equation (ID) hole 
1 1 1 
2 2 dig , 
(top) | a*(L, 1) o*(t,,2) 
T z t 1 
tn = 9h,—L5, wk Ex. —— s 
and 2p an —ty, where $, ( m m I ( = i) 


is unbiased to order 1/»? and has equal asymptotic efficiency to tons 


"IMP 4 à "Y „om the 
If, however, a correction is made for the mean, some loss in efficiency will arise fro 


e 
difference between the values of (y) for the two groups. If these are equal, the ae: : 7 
approach may be used with no extra loss in efficiency, and, since there is no order pe 
involved in statistics of this type, approximate equality of these values may m. o 
achieved by appropriate selection. For instance, if the $(y;) are ordered and the um. 
observations corresponding to alternate values are used in the two groups, then n). 
Eó(y;) and Eó*(y;) will be approximately equal for the two groups and tp = 2s — Mit» D 
will frequently be a sufficiently accurate and unbiased statistic. 

None of these results is, however, of much practical importance in that the : that . 
calculated directly using V/"(£) 2*(t,)| 2p (E). Their interest lies in that they indicato ses 
the same corrections applied to time-series statistics may be adequate for many purposes. This has already been suggested elsewhere (Quenouille, 1949).

i ; 


10. Consider first the application of these corrections in the estimation of a serial covariance, say the $p$th. Here, if the mean is known, say 0, an unbiased estimator exists:

$$t_n = (x_1 x_{p+1} + x_2 x_{p+2} + \dots + x_{n-p} x_n)/(n-p),$$

with variance $\sigma^4/(n-p)$ in the case where no correlation exists in the series, and $\sigma^4\lambda/(n-p)$ otherwise, where $\lambda$ depends upon the correlations between the products in this expression.

There exist also $n$ estimators based upon $n-1$ observations. Denoting the estimator which omits the $i$th observation by $t_{n-1,i}$, this has $n-p-1$ terms if $i \le p$ or $i > n-p$, and $n-p-2$ terms otherwise. The variance of $t_{n-1,i}$ is correspondingly $\sigma^4/(n-p-1)$ or $\sigma^4/(n-p-2)$ when no correlation exists in the series, and $\sigma^4\lambda_i/(n-p-1)$ or $\sigma^4\lambda_i/(n-p-2)$ if correlation exists. Here, $\lambda_i$ will differ for different $i$, but for large $n$ it will approximate to $\lambda$ for all $i$.
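The count of products remaining when one observation is omitted can be verified by enumeration. The sketch below assumes the estimator is the mean of the lag-$p$ products $x_j x_{j+p}$ with a known zero mean, that "omitting the $i$th observation" means dropping every product containing $x_i$, and that $n > 2p$. The function names are hypothetical.

```python
def lag_products(n, p, omit=None):
    """1-based index pairs (j, j + p) of the lag-p products avoiding `omit`."""
    return [(j, j + p) for j in range(1, n - p + 1) if omit not in (j, j + p)]


def serial_covariance_zero_mean(xs, p, omit=None):
    """Mean of the retained lag-p products, for a series with known mean 0."""
    pairs = lag_products(len(xs), p, omit)
    return sum(xs[j - 1] * xs[k - 1] for j, k in pairs) / len(pairs)


if __name__ == "__main__":
    n, p = 12, 3
    print(len(lag_products(n, p)))  # n - p products in the full estimator
    print([len(lag_products(n, p, omit=i)) for i in range(1, n + 1)])
```

Observations within $p$ places of either end of the series enter only one product, which is why their omission removes one term rather than two.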


" 
: : : n gl 
If there is no serial correlation or if we ignore the differences in A,, the analysis ther n 


e 7 Posi) / Po) 


= (n—2) (n—p)t,][n(n— p — 2) + 2p] = tm 
DS 


n* 




Thus the correction does not affect the estimate.

If the variation in $\lambda_i$ is taken into account, a slightly different estimate is obtained. This is still unbiased, but is less efficient (though asymptotically of equal efficiency) probably as a consequence of the individual estimators not being fully efficient. For example, the products in $t_n$ are correlated with one another, and hence the end products contain information on $x_{p+1}x_{2p+1}$, etc. The end-products should thus receive slightly greater weight for efficient estimation. Similarly, some of the products in $t_{n-1,i}$ should receive greater weight. With these provisions, it appears likely that the above procedure would not lead to any loss in efficiency, i.e. the slight loss in efficiency (asymptotically zero) which occurs results from the use of inefficient estimates. This obviously is of no practical importance.

The effectiveness of corrections of this type in the general case is more difficult to prove. Their effectiveness might be demonstrated by considering the correction for the mean in estimation of serial covariance.
f 


/n—p n—p 
n-p (= ti) | È up 
1 X Yitip Mol i=1 
=) OD = |, 


n bn n-p i=1 n—p 
_ then 


n 


» When the x; are uncorrelated, 


Ba) =~ Gap pol 


The ©xpectations of the t. , will vary. For instance, there will be a few terms of the type 
7—. 


oS) 
D Tif | 22 Vien &—2p-1 , C? 1 
ma T = Wea h+) 


E(t, 13) — (n—p-1y* = es de 
and 
= n—p 
ss n (s S wo) e? 1«o[ l J (p+1) 
Et, iQ E cmm E = E if m Dr 1), 


b - : 
Mt the majority (n— 4p out of n) will be of the form 


n—p S. —— 
z( E Li — Emp Um 2 isp m Tmp 
i=1 = 


E( ee) EE (n—p—2) 
a ee [«o( 3 
T ~(n—p-2) n—1 n 
hus 
. v* hio(l)| and Ze) = o(;.) 
Et...) = — zl TU E ii MW 
"^ zequivog, 


Similar results will hold for the serial correlation coefficients. It is, however, an open question as to whether the extra computation involved in calculating and using $\bar t_{n-1}$ is warranted compared with that involved in calculating and using $t_{\frac12 n}$. It appears likely that the use of the two half-series is sufficiently accurate for most practical purposes though this point requires further investigation.




REFERENCES 


Fisher, R. A. & Yates, F. (1953). Statistical Tables for Biological, Agricultural and Medical Research. Edinburgh: Oliver and Boyd.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Griffin and Co.

Quenouille, M. H. (1949). J. R. Statist. Soc. B, 11, 68-84.

Weatherburn, C. E. (1952). Mathematical Statistics. Cambridge University Press.


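APPENDIX
Proof of a formula in § 6

$$E(t_n) = \int_{-\infty}^{\infty} \frac{1}{\bar x}\,\frac{\sqrt n}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{n(\bar x - \mu)^2}{2\sigma^2}\right\} d\bar x = \frac{\sqrt n}{\sigma} \int_{-\infty}^{\infty} \frac{1}{y\sqrt{2\pi}} \exp\{-\tfrac12 (y-a)^2\}\,dy,$$

where

$$a = \mu\sqrt n/\sigma, \qquad y = \bar x\sqrt n/\sigma.$$

Let

$$I(a) = \int_{-\infty}^{\infty} \frac{1}{y\sqrt{2\pi}} \exp\{-\tfrac12 (y-a)^2\}\,dy,$$

then

$$\exp(\tfrac12 a^2)\,I(a) = \int_{-\infty}^{\infty} \frac{1}{y\sqrt{2\pi}} \exp(-\tfrac12 y^2 + ay)\,dy,$$

$$\frac{d}{da}\left[\exp(\tfrac12 a^2)\,I(a)\right] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp(-\tfrac12 y^2 + ay)\,dy = \exp(\tfrac12 a^2).$$

Thus

$$\exp(\tfrac12 a^2)\,I(a) = \int_0^a \exp(\tfrac12 u^2)\,du,$$

the limits being determined by the fact that $I(a) = 0$ when $a = 0$. Therefore

$$I(a) = \exp(-\tfrac12 a^2) \int_0^a \exp(\tfrac12 u^2)\,du$$

and

$$E(t_n) = \frac{\sqrt n}{\sigma} \exp\left(-\frac{n\mu^2}{2\sigma^2}\right) \int_0^{\mu\sqrt n/\sigma} \exp(\tfrac12 u^2)\,du.$$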
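The closed form of the appendix can be evaluated numerically and set beside the first two terms of its large-$\mu$ expansion. The sketch below assumes the result reads $E(t_n) = (\sqrt n/\sigma)\exp(-a^2/2)\int_0^a \exp(u^2/2)\,du$ with $a = \mu\sqrt n/\sigma$, and integrates $\exp\{(u^2 - a^2)/2\}$ by Simpson's rule so that no large exponentials are formed. Function names and the number of steps are hypothetical choices.

```python
import math


def expected_reciprocal_mean(mu, sigma, n, steps=20000):
    """(sqrt(n)/sigma) * exp(-a^2/2) * integral_0^a exp(u^2/2) du, a = mu*sqrt(n)/sigma.

    Simpson's rule (steps must be even) applied to exp((u^2 - a^2)/2).
    """
    a = mu * math.sqrt(n) / sigma
    h = a / steps

    def integrand(u):
        return math.exp(0.5 * (u * u - a * a))

    total = integrand(0.0) + integrand(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return math.sqrt(n) / sigma * total * h / 3.0


def large_mu_series(mu, sigma, n):
    """First two terms of the expansion: 1/mu + sigma^2 / (n * mu^3)."""
    return 1.0 / mu + sigma ** 2 / (n * mu ** 3)


if __name__ == "__main__":
    # Parameter values of the numerical illustration in section 6.
    print(expected_reciprocal_mean(2.0, 1.0, 10), large_mu_series(2.0, 1.0, 10))
```

For the parameters of section 6 the two values agree to about one part in five hundred, so the bias of $t_n$ is well described there by the single term $\sigma^2/(n\mu^3)$.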




AN INTRODUCTION TO SOME NON-PARAMETRIC 
GENERALIZATIONS OF ANALYSIS OF VARIANCE 
AND MULTIVARIATE ANALYSIS* 


By S. N. ROY and S. K. MITRA
Institute of Statistics, University of North Carolina 


It is clear that a p-variate body of data arranged in a q-way classification will formally look like a (p+q)-dimensional contingency table, but a distinction can be made between a ‘variate’ and a ‘way of classification’ in that along the direction of a ‘variate’ the marginal frequencies are supposed to be stochastic variates while along a ‘way of classification’ the marginal frequencies are supposed to be fixed. When, along certain directions, the marginals are fixed, an approach based on a conditional probability argument has been used.

In the present paper (i) the conditional probability approach is abandoned and we start either from a single multinomial distribution or a product of an appropriate number of different multinomial distributions according as, with multi-way frequency data, all ways are ‘variates’ or some are ‘variates’ and some are ‘ways of classification’. (ii) Also the hypotheses that are posed are of different kinds altogether according as we have a ‘multivariate analysis’ situation or an ‘analysis of variance’ situation. The hypotheses that are meaningful for one situation would not be too meaningful for the other and vice versa. Since the conditional probability approach is altogether abandoned, the mathematical theorems to which appeal is made are the two theorems as stated and proved by Cramér (1946, chapter 30) and a number of other such theorems which have been proved the same way and which, between them, take care of all the hypotheses discussed in this paper. When all ways are ‘variates’ the hypotheses are analogous to those in the usual multivariate analysis, and when some ways are ‘variates’ and some are ‘ways of classification’ the hypotheses are analogous to those in the analysis of variance.

The general methods discussed in this paper arose out of an attempt to analyse a large mass of categorical data. The analysis has been carried out, and a few typical cases (together with the numerical analysis) illustrating different parts of the theoretical development will be presented in a subsequent paper. In this paper only the large-sample tests are considered. How ‘large’ the sample size has to be for the validity of the use of these asymptotic techniques, or in other words, some results on the nature of the approximation involved will also be discussed in a later paper.


l. INTRODUCTION 


Cra, 
mér 
5 Mér’g chapter on y? (1946, chapter 30 


application of the earlier ideas of Barnard and Pearson (i) to multivariate analysis, starting from a single multinomial distribution and framing hypotheses appropriate to multivariate analysis situations, and (ii) to analysis of variance, starting from the product of an appropriate number of separate multinomial distributions and framing hypotheses appropriate to analysis of variance situations.

* This research was supported in part by the United States Air Force, through the Office of Scientific Research of the Air Research and Development Command.




The authors (presumably for the same reasons as Barnard and Pearson) abandon the conditional probability approach for a physical and a mathematical reason. The physical reason is that keeping the marginals fixed in the sense of conditional probability is seldom experimentally realizable, and although the authors are not dogmatic about this point, they would prefer to keep their probability statements as close as possible to experimentally realizable processes. The mathematical reason is the following. Let $n_i$ ($i = 1, 2, \dots, r$) be the observed frequency in the $i$th cell and $p_i$ be the probability (under any hypothesis) of getting an observation in the $i$th cell, and let the observations be independent in probability. Let $\sum_i n_i = n$ be supposed to be fixed, $p_i > 0$ and $\sum_i p_i = 1$. Then the unconditional likelihood function is $n! \prod_i p_i^{n_i} / \prod_i n_i!$. Next let us denote by $\mathbf{n}'$ the row vector $(n_1, \dots, n_r)$ and let $A$ be an $s \times r$ ($s < r$) matrix of constants of rank, say $t \le s$, and let us suppose that the $n_i$'s are subject to the constraints $A\mathbf{n} = \mathbf{a}^*$, a constant vector. Then the conditional likelihood function will be given by

$$\phi = \frac{\prod_i p_i^{n_i} / \prod_i n_i!}{\sum_{A\mathbf{n} = \mathbf{a}^*} \left[\prod_i p_i^{n_i} / \prod_i n_i!\right]}. \tag{1.1}$$

Assume, for simplicity of discussion, that the $p_i$'s are completely specified by the hypothesis (although this makes no essential difference to the argument). Is $\sum_{i=1}^{r} (n_i - np_i)^2/(np_i)$, in large samples, distributed as $\chi^2$ with degrees of freedom $(r-1) - t$, no matter what $A$ might be, provided that it is of rank $t$? If the answer to this question is yes, then the customary linear estimation or testing of linear hypotheses (in the sense of least squares and analysis of variance) can be carried out, starting from (1.1) and eventually using the $\chi^2$-test. One of the authors (Mitra) has constructed an $A$ matrix such that under a $\phi$ (of (1.1)) with that matrix, the limiting distribution of $\sum_i (n_i - np_i)^2/(np_i)$ is not a $\chi^2$-distribution. This, of course, will not affect the validity of the applications, mostly simple, actually made over the last fifty years starting from (1.1), because, in all these applications, $A$ is such that we have the $\chi^2$-distribution. But it would be mathematically unsafe to try to set up a general theory of testing of linear hypotheses, starting from (1.1), to say nothing of its being physically unsatisfactory. We do not know under what restrictions on $A$ the distribution will be $\chi^2$, but the authors have shown that at least those $A$'s are permissible under which, instead of starting from (1.1), we could also have abandoned the conditional probability approach and started from the product of a number of different multinomial distributions. The invalidating example will be discussed in a later paper.

2. PROBLEMS IN A TWO-WAY TABLE 


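To fix our ideas, consider first a two-way, say $r \times s$, table with observed frequencies $n_{ij}$ in the $(ij)$th cell ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$). Also let

$$\sum_j n_{ij} = n_{i0}, \qquad \sum_i n_{ij} = n_{0j} \qquad \text{and} \qquad \sum_{i,j} n_{ij} = n \text{ (say)}.$$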

2.1. Both ‘i’ and ‘j’ are ‘variates’

Assume that we have a sample of $n$ independent observations such that $p_{ij}$ is the probability of an observation in the $(ij)$th cell, and $n$ is fixed from sample to sample. Also let $\sum_j p_{ij} = p_{i0}$, $\sum_i p_{ij} = p_{0j}$, $\sum_{i,j} p_{ij} = p_{00} = 1$. Then the likelihood function is given by

$$\phi = \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i,j} p_{ij}^{n_{ij}}. \tag{2.1}$$

The composite hypothesis we shall be interested in testing is that ‘i’ and ‘j’ are independent, that is, $H_0$: $p_{ij} = p_{i0}p_{0j}$ against $H \ne H_0$, where the $p_{i0}$'s and $p_{0j}$'s are arbitrary positive nuisance parameters subject to $\sum_i p_{i0} = \sum_j p_{0j} = 1$. This is the analogue of the hypothesis of no correlation in a bivariate normal population. Under $H_0$ we shall have the likelihood function $\phi_0$ given by

$$\phi_0 = \frac{n!}{\prod_{i,j} n_{ij}!} \prod_{i,j} (p_{i0}p_{0j})^{n_{ij}} = \frac{n!}{\prod_{i,j} n_{ij}!} \prod_i p_{i0}^{n_{i0}} \prod_j p_{0j}^{n_{0j}}. \tag{2.1.1}$$

2.2. ‘i’ is a ‘way of classification’ and ‘j’ is a ‘variate’

Assume that we have $r$ independent sets of sizes $n_{10}, n_{20}, \dots, n_{r0}$ of independent observations such that $n_{i0}$ ($i = 1, 2, \dots, r$) is fixed from sample to sample and $p_{ij}$ is the probability of an observation in the $(ij)$th cell. Also we notice that $\sum_j p_{ij} = p_{i0} = 1$. Then the likelihood function is given by

$$\phi = \prod_i \left[\frac{n_{i0}!}{\prod_j n_{ij}!} \prod_j p_{ij}^{n_{ij}}\right]. \tag{2.2}$$

The composite hypothesis we shall be interested in testing is that $p_{ij}$, for any $j$, is independent of ‘i’, or in other words, $H_0$: $p_{ij} = q_{0j}$ (say) against $H \ne H_0$, where the $q_{0j}$'s are arbitrary positive nuisance parameters subject to $\sum_j q_{0j} = \sum_j p_{ij} = p_{i0} = 1$. This is the analogue of the hypothesis of the equality of means for $r$ homoscedastic univariate normal populations. Under $H_0$ we shall have

$$\phi_0 = \frac{\prod_i n_{i0}!}{\prod_{i,j} n_{ij}!} \prod_j q_{0j}^{n_{0j}}. \tag{2.2.1}$$

This $\phi_0$ could also have been obtained from the $\phi$ of (2.1), by putting $H_0$: $p_{ij} = p_{i0}p_{0j}$ and then finding under $H_0$ the conditional probability subject to the $n_{i0}$'s being fixed. But it seems to the authors that physically this is far less realistic than the model here used, although historically this is more or less what has been used so far.

The case of ‘i’ being a ‘variate’ and ‘j’ a ‘way of classification’ is exactly similar and need not be separately considered.
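The remark that the product-multinomial form under $H_0$ is also obtainable by conditioning on the row totals can be verified exactly on a small table. The sketch below uses rational arithmetic and assumes that the null likelihood with both margins random is a single multinomial with $p_{ij} = p_{i0}p_{0j}$, and that the null likelihood with rows fixed is a product of $r$ multinomials with common cell probabilities $q_{0j}$. The table, the probabilities and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def multinomial_coefficient(total, parts):
    """total! / (product of part!), as an exact fraction."""
    return Fraction(factorial(total), prod(factorial(k) for k in parts))


def column_totals(table):
    return [sum(row[j] for row in table) for j in range(len(table[0]))]


def phi0_both_variates(table, p_row, p_col):
    """One multinomial over all cells with p_ij = p_i0 * p_0j."""
    cells = [x for row in table for x in row]
    value = multinomial_coefficient(sum(cells), cells)
    for row, p in zip(table, p_row):
        value *= p ** sum(row)
    for total, p in zip(column_totals(table), p_col):
        value *= p ** total
    return value


def row_totals_probability(table, p_row):
    """Multinomial probability of the observed row totals."""
    totals = [sum(row) for row in table]
    value = multinomial_coefficient(sum(totals), totals)
    for total, p in zip(totals, p_row):
        value *= p ** total
    return value


def phi0_rows_fixed(table, q_col):
    """Product of r multinomials sharing the cell probabilities q_0j."""
    value = Fraction(1)
    for row in table:
        value *= multinomial_coefficient(sum(row), row)
    for total, q in zip(column_totals(table), q_col):
        value *= q ** total
    return value


if __name__ == "__main__":
    table = [[3, 1, 2], [0, 4, 2]]
    p_row = [Fraction(2, 5), Fraction(3, 5)]
    p_col = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
    conditional = phi0_both_variates(table, p_row, p_col) / row_totals_probability(table, p_row)
    print(conditional == phi0_rows_fixed(table, p_col))
```

The row probabilities cancel in the ratio, which is why the conditional form involves only the column parameters.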


2.3. Both ‘i’ and ‘j’ are ‘ways of classification’

Here we have a sampling scheme in which the $n_{i0}$'s and $n_{0j}$'s ($i = 1, 2, \dots, r$; $j = 1, 2, \dots, s$) are supposed to be fixed from sample to sample. In this situation, on the hypothesis of independence between ‘i’ and ‘j’, we can write down the likelihood function $\phi_0$ without assuming that the observations are independent. For this we start from an urn problem model in which there is an urn containing $n_{10}, n_{20}, \dots, n_{r0}$ balls of $r$ different colours from which we draw successively without replacement $n_{01}, n_{02}, \dots, n_{0s}$ balls (with $\sum_i n_{i0} = \sum_j n_{0j} = n$). The joint probability that the $j$th set of $n_{0j}$ balls will contain $n_{1j}, n_{2j}, \dots, n_{rj}$ balls of different colours (with $j = 1, 2, \dots, s$) will be given by

$$\phi_0 = \prod_i n_{i0}! \prod_j n_{0j}! \Big/ \Big(n! \prod_{i,j} n_{ij}!\Big). \tag{2.3}$$
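The urn probability with both sets of margins fixed is easy to evaluate exactly. The sketch below assumes the form $\prod_i n_{i0}! \prod_j n_{0j}! / (n! \prod_{ij} n_{ij}!)$ and, as an illustration only, enumerates every $2 \times 2$ table with margins (4, 4) and (4, 4) to confirm that the probabilities sum to one. Function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def phi0_margins_fixed(table):
    """prod n_i0! * prod n_0j! / (n! * prod n_ij!) as an exact fraction."""
    rows = [sum(r) for r in table]
    cols = [sum(r[j] for r in table) for j in range(len(table[0]))]
    numerator = prod(factorial(k) for k in rows) * prod(factorial(k) for k in cols)
    denominator = factorial(sum(rows)) * prod(factorial(x) for r in table for x in r)
    return Fraction(numerator, denominator)


def two_by_two_tables(row_totals, col_totals):
    """Generate every 2 x 2 table of non-negative counts with the given margins."""
    (r1, r2), (c1, _c2) = row_totals, col_totals
    for a in range(max(0, c1 - r2), min(r1, c1) + 1):
        yield [[a, r1 - a], [c1 - a, r2 - (c1 - a)]]


if __name__ == "__main__":
    print(phi0_margins_fixed([[4, 0], [0, 4]]))
    print(sum(phi0_margins_fixed(t) for t in two_by_two_tables((4, 4), (4, 4))))
```

Because the expression contains no unknown parameters, it gives the null distribution directly, but it offers no corresponding form under a general alternative.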


The great advantage of this scheme is that the different observations need not be assumed to be independent, and the great disadvantage is that it is not clear what is the form of $\phi$ under a general $H$ as distinct from the null hypothesis $H_0$ of independence between ‘i’ and ‘j’. This means that the power of a test for $H_0$ against alternatives cannot be obtained and also that it is not possible to obtain a one-tailed $\chi^2$-test for $H_0$ using the same kind of heuristic arguments that we shall use in the first two situations. A one-tailed $\chi^2$-test can be found here by analogy with what is done in the first two cases.

This $\phi_0$ could also have been obtained from the $\phi$ of (2.1), by putting $H_0$: $p_{ij} = p_{i0}p_{0j}$ and then finding under $H_0$ the conditional probability subject to the $n_{i0}$'s and $n_{0j}$'s being fixed. But this would be less realistic than the model here used and would deprive (2.3) of the one great advantage it possesses in that the successive observations do not have to be independent. Notice that (2.1) is based on the observations being all independent.

It will be seen that the approach here is not one of conditional probability at all. It will also be seen that there are three different sampling schemes each leading in a natural way to a particular probability model and a particular type of hypothesis to be tested. From a physical standpoint it would not be proper to break this tie and use a particular probability model and test a particular type of hypothesis when the sampling scheme is something different. It will be noticed that in most situations of life the natural sampling schemes are those of (i) or (ii), but there are situations, e.g. Fisher's tea-tasting experiments or those connected with the extra-sensory perception experiments or with the claims of astrologers as to prediction, etc., where (iii) might be a natural sampling scheme.


3. PROBLEMS IN A THREE-WAY TABLE 


As a natural extension of a two-way table consider a three-way $r \times s \times t$ table with observed frequencies $n_{ijk}$ in the $(ijk)$th cell ($i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s$; $k = 1, 2, \ldots, t$). Also let

$$\sum_i n_{ijk} = n_{ojk}, \quad \sum_j n_{ijk} = n_{iok}, \quad \sum_k n_{ijk} = n_{ijo}, \quad \sum_{i,j} n_{ijk} = n_{ook},$$
$$\sum_{i,k} n_{ijk} = n_{ojo}, \quad \sum_{j,k} n_{ijk} = n_{ioo}, \quad \sum_{i,j,k} n_{ijk} = n_{ooo} = n \text{ (say)}.$$

3·1. ‘i’, ‘j’ and ‘k’ all ‘variates’

Assume that we have a sample of $n$ independent observations such that $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell and $n$ is fixed from sample to sample. Also let

$$\sum_i p_{ijk} = p_{ojk}, \quad \sum_j p_{ijk} = p_{iok}, \quad \sum_k p_{ijk} = p_{ijo}, \quad \sum_{i,j} p_{ijk} = p_{ook},$$
$$\sum_{i,k} p_{ijk} = p_{ojo}, \quad \sum_{j,k} p_{ijk} = p_{ioo}, \quad \sum_{i,j,k} p_{ijk} = p_{ooo} = 1.$$

The likelihood function will be given by

$$\phi = \frac{n!}{\prod_{i,j,k} n_{ijk}!} \prod_{i,j,k} p_{ijk}^{\,n_{ijk}}. \tag{3·1}$$

In this case we shall be interested in testing a class of composite hypotheses, a typical one being the composite


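3·1a. Hypothesis of conditional independence between ‘i’ and ‘j’ for fixed ‘k’

This will be

$$H_0: \frac{p_{ijk}}{p_{ook}} = \frac{p_{iok}}{p_{ook}} \, \frac{p_{ojk}}{p_{ook}} \quad \text{or} \quad p_{ijk} = \frac{p_{iok}\, p_{ojk}}{p_{ook}} \tag{3·1·1}$$

against $H \neq H_0$ ($i = 1, \ldots, r$; $j = 1, \ldots, s$; $k = 1, \ldots, t$).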
This is the analogue of the hypothesis of no partial correlation between $x$ and $y$, given $z$, in a three-variate normal population. It is easy to see that if we superimpose on $H_0$ the composite hypothesis of independence between ‘i’ and ‘k’, and between ‘j’ and ‘k’, i.e.

$$p_{iok} = p_{ioo}\, p_{ook} \quad \text{and} \quad p_{ojk} = p_{ojo}\, p_{ook}, \tag{3·1·2}$$

which is the analogue of the hypothesis of no total correlation between $x$ and $z$, and between $y$ and $z$, in a three-variate normal population, we should have

$$p_{ijk} = p_{ioo}\, p_{ojo}\, p_{ook}, \tag{3·1·3}$$

which is the condition of complete independence of ‘i’, ‘j’ and ‘k’.

We shall also be interested in another class of composite hypotheses, the typical one being the composite


3·1b. Hypothesis of independence between ‘(i,j)’ and ‘k’

This will be

$$H_0: p_{ijk} = p_{ijo}\, p_{ook} \quad \text{against } H \neq H_0 \quad (i = 1, \ldots, r;\; j = 1, \ldots, s;\; k = 1, \ldots, t). \tag{3·1·4}$$


This is the analogue of the hypothesis of no multiple correlation between $(x, y)$ and $z$ in a three-variate normal population.

It is easy to check by summing the two sides of the above equation over $i$ and also over $j$ separately that (3·1·4) implies the composite hypotheses

$$p_{iok} = p_{ioo}\, p_{ook} \quad \text{and} \quad p_{ojk} = p_{ojo}\, p_{ook}. \tag{3·1·5}$$

But (3·1·5) will not imply (3·1·4). It has been shown by Roy & Kastenbaum (1956) that it is necessary to superimpose on (3·1·5) the additional hypothesis

$$H_0: p_{ijk} = q_{ijo}\, q_{iok}\, q_{ojk} \tag{3·1·6}$$

to obtain (3·1·4), where $q_{ijo}$, $q_{iok}$, $q_{ojk}$ are defined to be arbitrary positive functions of $(i, j)$, $(i, k)$ and $(j, k)$ with no summation convention connecting them as in the case of the $p_{ijk}$'s. By analogy with analysis of variance $H_0$ will be called the hypothesis of ‘no interaction’.
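The one-way implication between (3·1·4) and (3·1·5) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the helper names and both probability tables are hypothetical and are not taken from the paper. The first table is built as a product $p_{ijo} p_{ook}$, so it satisfies (3·1·4); the second is a 2 x 2 x 2 table whose two-way margins are all uniform, so it satisfies (3·1·5) without satisfying (3·1·4).

```python
from itertools import product


def marg(p, keep):
    """Marginal of p (dict keyed by (i, j, k)) over the axes not listed in keep."""
    out = {}
    for cell, v in p.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0.0) + v
    return out


def satisfies_314(p, tol=1e-12):
    """Check p_ijk = p_ijo * p_ook for every cell."""
    pij, pk = marg(p, (0, 1)), marg(p, (2,))
    return all(abs(v - pij[(i, j)] * pk[(k,)]) < tol for (i, j, k), v in p.items())


def satisfies_315(p, tol=1e-12):
    """Check p_iok = p_ioo * p_ook and p_ojk = p_ojo * p_ook."""
    pi, pj, pk = marg(p, (0,)), marg(p, (1,)), marg(p, (2,))
    pik, pjk = marg(p, (0, 2)), marg(p, (1, 2))
    ok_ik = all(abs(v - pi[(i,)] * pk[(k,)]) < tol for (i, k), v in pik.items())
    ok_jk = all(abs(v - pj[(j,)] * pk[(k,)]) < tol for (j, k), v in pjk.items())
    return ok_ik and ok_jk


# Hypothetical table satisfying (3·1·4): a product of an (i, j) table and a k margin.
pij = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
pk = [0.25, 0.75]
product_table = {(i, j, k): pij[(i, j)] * pk[k] for (i, j) in pij for k in range(2)}

# Hypothetical counterexample: uniform two-way margins, but a three-way pattern remains.
eps = 0.05
counterexample = {(i, j, k): 0.125 + eps * (-1) ** (i + j + k)
                  for i, j, k in product(range(2), repeat=3)}
```

The product table passes both checks, while the counterexample passes only the check for (3·1·5), which is the gap that the ‘no interaction’ hypothesis (3·1·6) is meant to close.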


3·2. ‘i’ and ‘j’ are ‘variates’ and ‘k’ a ‘way of classification’

Assume that we have $t$ independent sets of sizes $n_{oo1}, \ldots, n_{oot}$ of independent observations such that $n_{ook}$ ($k = 1, 2, \ldots, t$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell. Notice that $\sum_{i,j} p_{ijk} = p_{ook} = 1$. The likelihood function will be given by

$$\phi = \prod_k \left[ \frac{n_{ook}!}{\prod_{i,j} n_{ijk}!} \prod_{i,j} p_{ijk}^{\,n_{ijk}} \right]. \tag{3·2}$$

Here we shall be interested in testing the composite

3·2a. Hypothesis of independence between ‘i’ and ‘j’ for each ‘k’, that is,

$$H_0: p_{ijk} = p_{iok}\, p_{ojk} \quad \text{against } H \neq H_0 \quad (i = 1, \ldots, r;\; j = 1, \ldots, s;\; k = 1, \ldots, t). \tag{3·2·1}$$


If we superimpose on this the composite hypothesis that the marginal ‘i’ (obtained by summing over ‘j’) is independent of ‘k’ and similarly for ‘j’, that is,

$$p_{iok} = q_{ioo} \text{ (say)} \quad \text{and} \quad p_{ojk} = q_{ojo} \text{ (say)}, \tag{3·2·2}$$

we should have

$$p_{ijk} = q_{ioo}\, q_{ojo}. \tag{3·2·3}$$

Notice from (3·2·2) that $\sum_i q_{ioo} = \sum_i p_{iok} = p_{ook} = 1$ and also that $\sum_j q_{ojo} = \sum_j p_{ojk} = p_{ook} = 1$.

We shall also be interested in the composite

3·2b. Hypothesis that $p_{ijk}$ is independent of ‘k’, that is,

$$H_0: p_{ijk} = q_{ijo} \text{ (say)} \quad \text{against } H \neq H_0 \quad (\text{for all } i, j \text{ and } k). \tag{3·2·4}$$

This is the analogue of the hypothesis of the equality of $t$ mean vectors (each of two components) for $t$ bivariate normal populations, each having the same variance-covariance matrix.

Summing over ‘j’ and ‘i’ separately this would imply

$$p_{iok} = \sum_j q_{ijo} = q_{ioo} \text{ (say)} \quad \text{and} \quad p_{ojk} = \sum_i q_{ijo} = q_{ojo} \text{ (say)}. \tag{3·2·5}$$

As in the case where ‘i’, ‘j’ and ‘k’ are all ‘variates’, so also here, (3·2·4) implies (3·2·5) but (3·2·5) does not imply (3·2·4). Exactly in the same way as shown by Roy & Kastenbaum (1956) it can be shown that the extra condition which, when superimposed on (3·2·5), will imply (3·2·4) is

$$p_{ijk} = q_{ijo}\, q_{ojk}\, q_{iok}. \tag{3·2·6}$$


3·3. ‘i’ is a ‘variate’ and ‘j’ and ‘k’ are ‘ways of classification’

Assume that we have $s \times t$ independent sets of sizes $n_{ojk}$ of independent observations such that $n_{ojk}$ ($j = 1, \ldots, s$; $k = 1, \ldots, t$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell. Notice that $\sum_i p_{ijk} = p_{ojk} = 1$. The likelihood function will be given by

$$\phi = \prod_{j,k} \left[ \frac{n_{ojk}!}{\prod_i n_{ijk}!} \prod_i p_{ijk}^{\,n_{ijk}} \right]. \tag{3·3}$$

Here we shall be interested in the composite

3·3a. Hypothesis that for any ‘k’, $p_{ijk}$ is independent of ‘j’, that is,

$$H_0: p_{ijk} = q_{iok} \text{ (say)} \quad \text{against } H \neq H_0 \quad (\text{for all } i, j \text{ and } k). \tag{3·3·1}$$

Notice that $\sum_i q_{iok} = \sum_i p_{ijk} = p_{ojk} = 1$.

We shall be also interested in the other composite

3·3b. Hypothesis that for any ‘j’, $p_{ijk}$ is independent of ‘k’, that is,

$$H_0: p_{ijk} = q_{ijo} \text{ (say)} \quad \text{against } H \neq H_0 \quad (\text{for all } i, j \text{ and } k). \tag{3·3·2}$$

Notice that $\sum_i q_{ijo} = \sum_i p_{ijk} = p_{ojk} = 1$. We now observe that (3·3·1) together with (3·3·2) implies that $p_{ijk}$ is a pure function of ‘i’, i.e. that

$$p_{ijk} = q_{ioo} \text{ (say)} \quad (\text{for all } i, j \text{ and } k). \tag{3·3·3}$$


If, in a one-way classification in the usual analysis of variance, ‘i’ corresponds to the ‘variate’, ‘j’ to the ‘concomitant variate’ and ‘k’ to the ‘way of classification’, then it will be seen on a little reflexion that (3·2·1) will be the analogue of the hypothesis of no regression and (3·2·4) will be the analogue of the hypothesis of no covariance. On the other hand, suppose we take ‘j’ and ‘k’ as just two ‘ways of classification’, for example, if we take ‘j’ as, say, blocks and ‘k’ as, say, treatments in a randomized block experiment (with more than one and in general unequal number of replications in each cell). Then (3·3·1) will be the analogue of ‘no block effect’ for each treatment separately and (3·3·2) will be the analogue of ‘no treatment effect’ for each block separately. In other words, in the usual parlance of analysis of variance, (3·3·1) combines the hypotheses of ‘no main effect’ and ‘no interaction’, while (3·3·2) combines the hypotheses of another ‘no main effect’ and ‘no interaction’. In the analysis of variance situations for data which are not of the ‘normal variate’ type, the authors believe that this would be a better way of handling the material than the one used by Roy & Kastenbaum. Even for ‘normal variate’ type of data in the analysis of variance situations the authors are not sure that from the physical standpoint this might not be a better approach than the customary one of analysis into main effects and interactions of various orders. It is hoped to consider this in a later paper.


3·4. ‘i’ is a variate and ‘j’ and ‘k’ are ‘ways of classification’ in the sense of a ‘balanced incomplete’ or ‘partially balanced incomplete’ or a more general type of ‘incomplete’ block experiment

Assume as before that there are $r$ ‘i’'s, $s$ ‘j’'s and $t$ ‘k’'s. Assume further that ‘j’ is a block and ‘k’ a treatment and that, for any ‘j’, there is a set of treatments $(T)_j$ associated with it, of number $t_j$. In other words, for a given $j$, $k$ takes on the set of values $(t)_j$, where $(t)_j$ is a set of indices of number $t_j$ out of $1, 2, \ldots, t$. Now assume that we have $\sum_{j=1}^{s} t_j$ independent sets of sizes $n_{ojk}$ of independent observations such that $n_{ojk}$ ($k \in (t)_j$; $j = 1, 2, \ldots, s$) is fixed from sample to sample and $p_{ijk}$ is the probability of an observation in the $(ijk)$th cell. As before $\sum_i p_{ijk} = p_{ojk} = 1$. The likelihood function will be given by

$$\phi = \prod_j \prod_{k \in (t)_j} \left[ \frac{n_{ojk}!}{\prod_i n_{ijk}!} \prod_i p_{ijk}^{\,n_{ijk}} \right]. \tag{3·4}$$

We can take over the hypothesis (3·3·1) of ‘no block effect for each treatment separately’ and (3·3·2) of ‘no treatment effect for each block separately’. For a ‘balanced incomplete design’ all the $t_j$'s will be equal and there will be a highly symmetrical pattern while for a ‘partially balanced design’ all the $t_j$'s will be equal but there will be a less symmetrical pattern.


3·5. ‘i’ is a ‘way of classification’ and ‘j, k’ also are ‘ways of classification’ in the sense that the $n_{ioo}$'s and $n_{ojk}$'s are fixed from sample to sample

We can write down $\phi_0$ in this case in exactly the same way as we wrote down the $\phi_0$ in § 2·3. On the hypothesis of independence between ‘i’ and ‘(j,k)’, this will be

$$\phi_0 = \prod_i n_{ioo}! \prod_{j,k} n_{ojk}! \Big/ \Big( n! \prod_{i,j,k} n_{ijk}! \Big). \tag{3·5}$$

Starting from this we can test the hypothesis of independence between ‘i’ and ‘(j,k)’.


The case of ‘i’, ‘j’ and ‘k’ being ‘ways of classification’ in the sense of the $n_{ioo}$'s, $n_{ojo}$'s and $n_{ook}$'s (but not the $n_{ojk}$'s) being fixed from sample to sample is also of some interest, but we shall not consider that case in the present paper.

The extension of the problems of the two-way tables of § 2 to those of the three-way tables of § 3 is a rather big conceptual jump, but the extension from three-way tables to those of higher dimensions involves no such jump and will not be discussed in this paper except for some remarks towards the end.


4. THE DERIVATION OF THE $\chi^2$ TEST ON THE UNION-INTERSECTION PRINCIPLE

Let a random sample of size $n$ from some population be classified into $k$ ($< n$) mutually exclusive and exhaustive categories according to some observable characteristics (qualitative or quantitative), and let the probability of a random observation falling in the $i$th category be $p_i$ with $p_i > 0$ and $\sum_{i=1}^{k} p_i = 1$. Let $n_i$ denote the observed frequency in the $i$th category with, of course, $\sum_i n_i = n$. Also let $\mathbf{n}' = (n_1, n_2, \ldots, n_k)$ and $\mathbf{p}' = (p_1, p_2, \ldots, p_k)$. We have now

$$P[\mathbf{n}' \mid \mathbf{p}'] = \frac{n!}{\prod_i n_i!} \prod_i p_i^{\,n_i}. \tag{4}$$
A t° * 


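4·1. A simple hypothesis $H_0: \mathbf{p}' = \mathbf{p}_0'$ against the composite alternative $H: \mathbf{p}' \neq \mathbf{p}_0'$

Consider first the most powerful test at a level, say $\beta$, of $H_0: \mathbf{p}' = \mathbf{p}_0'$ against a specific alternative $\mathbf{p}' = \mathbf{p}_1' \neq \mathbf{p}_0'$, which, by the Neyman-Pearson lemma, will be as follows: Reject $H_0$ if

$$P[\mathbf{n}' \mid \mathbf{p}_1'] \big/ P[\mathbf{n}' \mid \mathbf{p}_0'] \geq \mu \tag{4·1·1}$$

and accept $H_0$ otherwise, where, given $\mu$, the size of the critical region (4·1·1) under $\mathbf{p}_0$ should be $\beta(\mu, \mathbf{p}_0, \mathbf{p}_1, n)$. Substituting in (4·1·1) from (4) and taking logarithms on both sides we see after a little simplification that (4·1·1) becomes

$$\frac{\mathbf{a}'(\mathbf{p}_1)(\mathbf{n} - n\mathbf{p}_0)}{\sqrt{\{\mathbf{a}'(\mathbf{p}_1)\Lambda^0 \mathbf{a}(\mathbf{p}_1)\}}} \geq \frac{\log\mu - n\,\mathbf{a}'(\mathbf{p}_1)\mathbf{p}_0}{\sqrt{\{\mathbf{a}'(\mathbf{p}_1)\Lambda^0 \mathbf{a}(\mathbf{p}_1)\}}} = c(\mathbf{p}_1, \mu, n) \text{ say}, \tag{4·1·2}$$

where $\mathbf{a}'(\mathbf{p}_1) = [\log(p_{11}/p_{10}), \log(p_{21}/p_{20}), \ldots, \log(p_{k1}/p_{k0})]$, and $\Lambda^0 = (\lambda^0_{ij})$ is the variance-covariance matrix of $n_1, n_2, \ldots, n_k$ under $H_0: \mathbf{p}' = \mathbf{p}_0'$, and where $\beta$ is supposed to vary with, that is depend upon, $\mathbf{p}_1$. It is thus evident that, for a fixed $c$, the critical region

$$w(\mathbf{p}_1, c) = \left\{ \mathbf{n} : \frac{\mathbf{a}'(\mathbf{p}_1)[\mathbf{n} - n\mathbf{p}_0]}{\sqrt{\{\mathbf{a}'(\mathbf{p}_1)\Lambda^0 \mathbf{a}(\mathbf{p}_1)\}}} \geq c \right\} \tag{4·1·3}$$

is the most powerful critical region for testing $\mathbf{p}' = \mathbf{p}_0'$ against a specific $\mathbf{p}' = \mathbf{p}_1'$ ($\neq \mathbf{p}_0'$) at a level of significance $\beta(\mathbf{p}_1, c, n)$. Since the composite $H: \mathbf{p}' \neq \mathbf{p}_0'$ is the union of all the specific alternatives $\mathbf{p}_1'$, we use the union-intersection principle (Roy, 1953) and take for $H_0$ against the composite $H$ the critical region

$$w(c) = \bigcup_{\mathbf{p}_1 \neq \mathbf{p}_0} w(\mathbf{p}_1, c). \tag{4·1·4}$$

Thus we should have for the complement of $w(c)$

$$\left\{ \mathbf{n} : \sup_{\mathbf{p}_1} \frac{\mathbf{a}'(\mathbf{p}_1)[\mathbf{n} - n\mathbf{p}_0]}{\sqrt{\{\mathbf{a}'(\mathbf{p}_1)\Lambda^0 \mathbf{a}(\mathbf{p}_1)\}}} < c \right\}. \tag{4·1·5}$$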


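Since $\sum_i n_i = n$ and $\sum_i p_{i0} = 1$, we can write

$$\frac{\mathbf{a}'(\mathbf{p})[\mathbf{n} - n\mathbf{p}_0]}{\sqrt{\{\mathbf{a}'(\mathbf{p})\Lambda^0 \mathbf{a}(\mathbf{p})\}}} = \frac{\mathbf{b}'(\mathbf{p})[\mathbf{n}^* - n\mathbf{p}_0^*]}{\sqrt{\{\mathbf{b}'(\mathbf{p})\Lambda^0_{k \cdot k} \mathbf{b}(\mathbf{p})\}}}, \tag{4·1·6}$$

where $\mathbf{n}^{*\prime} = (n_1, n_2, \ldots, n_{k-1})$, $\mathbf{p}_0^{*\prime} = (p_{10}, p_{20}, \ldots, p_{k-1,0})$,

$$\mathbf{b}'(\mathbf{p}) = (b_1(\mathbf{p}), b_2(\mathbf{p}), \ldots, b_{k-1}(\mathbf{p})),$$
$$b_i(\mathbf{p}) = a_i(\mathbf{p}) - a_k(\mathbf{p}) = \log(p_i p_{k0} / p_k p_{i0}) \quad (i = 1, 2, \ldots, k-1)$$

and $\Lambda^0_{k \cdot k}$ is the matrix formed by omitting the $k$th row and the $k$th column of $\Lambda^0$. Notice that each $b_i(\mathbf{p})$ can assume any value on the real line and conversely, given any real vector $\mathbf{b}_0' = (b_{10}, b_{20}, \ldots, b_{k-1,0})$, the equations $\mathbf{b}'(\mathbf{p}) = \mathbf{b}_0'$ have always a unique solution in $\mathbf{p}$; thus

$$p_i / p_k = (p_{i0} / p_{k0})\, e^{b_{i0}} = \lambda_{i0} \text{ say};$$

or

$$p_i = \lambda_{i0} \Big/ \Big(1 + \sum_{j=1}^{k-1} \lambda_{j0}\Big) \quad (i = 1, 2, \ldots, k-1) \quad \text{and} \quad p_k = 1 \Big/ \Big(1 + \sum_{j=1}^{k-1} \lambda_{j0}\Big).$$

Hence we have (Roy, 1953)

$$\sup_{\mathbf{p}} \frac{\mathbf{a}'(\mathbf{p})[\mathbf{n} - n\mathbf{p}_0]}{\sqrt{\{\mathbf{a}'(\mathbf{p})\Lambda^0 \mathbf{a}(\mathbf{p})\}}} = \sup_{\mathbf{b}} \frac{\mathbf{b}'[\mathbf{n}^* - n\mathbf{p}_0^*]}{\sqrt{\{\mathbf{b}'\Lambda^0_{k \cdot k} \mathbf{b}\}}} = \left\{ [\mathbf{n}^* - n\mathbf{p}_0^*]' (\Lambda^0_{k \cdot k})^{-1} [\mathbf{n}^* - n\mathbf{p}_0^*] \right\}^{1/2}.$$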


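We next observe that

$$\lambda^0_{ij} = -n p_{i0} p_{j0} \text{ if } i \neq j \quad \text{and} \quad \lambda^0_{ii} = n p_{i0}(1 - p_{i0}).$$

It is easy to check (Roy & Sarhan, 1956) that if $(\Lambda^0_{k \cdot k})^{-1} = (\lambda^{ij})$, then $\lambda^{ij} = 1/(n p_{k0})$ if $i \neq j$ and $\lambda^{ii} = 1/(n p_{i0}) + 1/(n p_{k0})$. We have, therefore,

$$\sup_{\mathbf{p}} \frac{\mathbf{a}'(\mathbf{p})[\mathbf{n} - n\mathbf{p}_0]}{\sqrt{\{\mathbf{a}'(\mathbf{p})\Lambda^0 \mathbf{a}(\mathbf{p})\}}} = \left\{ \sum_{i=1}^{k} \frac{(n_i - n p_{i0})^2}{n p_{i0}} \right\}^{1/2}.$$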

From (4·1·5) we see that (4·1·4) becomes

$$w(c) = \left\{ \mathbf{n} : \sum_{i=1}^{k} \frac{(n_i - n p_{i0})^2}{n p_{i0}} \geq c^2 \right\}. \tag{4·1·7}$$

Since the left-hand side of the inequality in (4·1·7) is essentially non-negative it is easy to see that we obtain a non-trivial solution only when $c > 0$. It is thus seen that the $\chi^2$ critical region is obtained by using the union-intersection with respect to variation over the $\mathbf{p}_1$'s ($\neq \mathbf{p}_0$), keeping fixed a quantity $c$ defined (in terms of $\mathbf{p}_1$ and $n$) by the right-hand side of (4·1·2). This means of course letting $\beta$ vary with $\mathbf{p}_1$ in an appropriate manner. Now it is clear from the asymptotic normality of the left-hand side of the inequality in (4·1·3) that we have, as $n \to \infty$,

$$\beta(\mathbf{p}_1, c, n) \to \frac{1}{\sqrt{2\pi}} \int_c^{\infty} e^{-t^2/2}\, dt. \tag{4·1·8}$$

In large samples it is thus seen that keeping $c$ fixed means making $\beta$ the same for all $\mathbf{p}_1$'s, which means that in large samples the $\chi^2$ critical region (4·1·7) is a union-intersection critical region of type I (Roy, 1953). For large $n$ (the approximation would be good enough even for moderately large values of the $n_i$'s) it is well known that the left-hand side of the inequality in (4·1·7) is asymptotically distributed as $\chi^2$ with $(k - 1)$ degrees of freedom. For a satisfactory proof see Cramér (1946, chapter 30).
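The limiting level in (4·1·8) is the upper tail of the standard normal distribution at $c$, which can be evaluated with the complementary error function. The function name below is hypothetical; the sketch only illustrates that a fixed $c$ corresponds to a single limiting level, whatever the specific alternative.

```python
from math import erfc, sqrt


def limiting_level(c):
    """Upper standard normal tail at c: (1/sqrt(2*pi)) * integral from c to infinity of exp(-t^2/2)."""
    return 0.5 * erfc(c / sqrt(2.0))


# c = 0 gives one half; c near 1.645 gives a limiting level of about 0.05.
level_at_zero = limiting_level(0.0)
level_at_1645 = limiting_level(1.6448536269514722)
```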


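4·2. Test of a composite hypothesis against a composite alternative

Suppose that the composite null hypothesis is given by

$$H_0: \{p_i = p_i(\theta_1, \theta_2, \ldots, \theta_r)\}, \quad (\theta_1, \ldots, \theta_r) \in \Omega,$$

where $p_i(\theta_1, \ldots, \theta_r)$ are $k$ known functions of $r$ ($< k$) unknown parameters. The hypothesis does not specify the values of the parameters except that they belong to a certain parametric space $\Omega$. The (composite) alternative is $H \neq H_0$. For any given $(\theta_1^0, \theta_2^0, \ldots, \theta_r^0)$ we obtain, as in the previous section, a heuristic test of the hypothesis $H_0: \{p_i = p_i(\theta_1^0, \ldots, \theta_r^0)\}$ against $H \neq H_0$ which has a critical region

$$w(c, \theta_1^0, \ldots, \theta_r^0) = \left\{ \mathbf{n} : \sum_{i=1}^{k} \frac{[n_i - n p_i(\theta_1^0, \ldots, \theta_r^0)]^2}{n p_i(\theta_1^0, \ldots, \theta_r^0)} \geq c^2 \right\}. \tag{4·2·1}$$

This critical region is the region of rejection of $H_0: \{p_i = p_i(\theta_1^0, \ldots, \theta_r^0)\}$ for a specific $(\theta_1^0, \ldots, \theta_r^0)$. Now to reject $H_0: \{p_i = p_i(\theta_1, \ldots, \theta_r)\}$, $(\theta_1, \ldots, \theta_r) \in \Omega$, would be to reject

$$H_0: p_i = p_i(\theta_1^0, \ldots, \theta_r^0) \text{ for every } (\theta_1^0, \ldots, \theta_r^0) \in \Omega,$$

and thus using the union-intersection principle for the second time we have for $H_0$ the critical region

$$w(c, H_0) = \bigcap_{(\theta_1^0, \ldots, \theta_r^0) \in \Omega} w(c, \theta_1^0, \ldots, \theta_r^0) = \left\{ \mathbf{n} : \inf_{(\theta_1, \ldots, \theta_r) \in \Omega} \sum_{i=1}^{k} \frac{[n_i - n p_i(\theta_1, \ldots, \theta_r)]^2}{n p_i(\theta_1, \ldots, \theta_r)} \geq c^2 \right\}, \tag{4·2·2}$$

which is precisely the minimum $\chi^2$ critical region. The equations giving the $\theta_j$'s in terms of the $n_i$'s, in the form for minimum $\chi^2$, are

$$\frac{\partial}{\partial \theta_j} \sum_{i=1}^{k} \frac{[n_i - n p_i(\theta_1, \ldots, \theta_r)]^2}{n p_i(\theta_1, \ldots, \theta_r)} = 0 \quad (j = 1, 2, \ldots, r). \tag{4·2·3}$$

It has been shown by Cramér that for large $n_i$'s the equations (4·2·3) can be replaced by the maximum-likelihood equations which are simpler to use, the likelihood function being

$$\phi_0 = \frac{n!}{\prod_i n_i!} \prod_i p_i^{\,n_i}(\theta_1, \ldots, \theta_r). \tag{4·2·4}$$

The maximum-likelihood equations can be put in the form

$$\frac{\partial \log \phi_0}{\partial \theta_j} = \sum_{i=1}^{k} \frac{n_i}{p_i} \frac{\partial p_i}{\partial \theta_j} = \sum_{i=1}^{k} \frac{n_i - n p_i}{p_i} \frac{\partial p_i}{\partial \theta_j} = 0 \quad (j = 1, 2, \ldots, r). \tag{4·2·5}$$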
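The closeness of the minimum $\chi^2$ and maximum-likelihood solutions in large samples can be seen on a small example. The sketch below is an illustration only: the one-parameter trinomial model $p(\theta) = (\theta^2, 2\theta(1-\theta), (1-\theta)^2)$, the counts, the grid search and all function names are assumptions introduced here and do not come from the paper.

```python
def cell_probs(theta):
    """Hypothetical one-parameter model for k = 3 cells (an assumption for illustration)."""
    return (theta ** 2, 2 * theta * (1 - theta), (1 - theta) ** 2)


def chi2(counts, theta):
    """Sum of (n_i - n p_i(theta))^2 / (n p_i(theta)), as in (4·2·1)."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, cell_probs(theta)))


def min_chi2_theta(counts, steps=9800):
    """Crude grid search for the theta minimising chi2 on [0.01, 0.99]."""
    grid = [0.01 + 0.98 * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda th: chi2(counts, th))


def ml_theta(counts):
    """Closed-form solution of the likelihood equation (4·2·5) for this model."""
    n1, n2, n3 = counts
    return (2 * n1 + n2) / (2 * (n1 + n2 + n3))


counts = (370, 460, 170)          # illustrative data, n = 1000
theta_ml = ml_theta(counts)       # 0.6 for these counts
theta_mc = min_chi2_theta(counts)
```

For these counts the two estimates agree to within the grid resolution of a few parts in ten thousand, which is the practical sense in which (4·2·3) can be replaced by (4·2·5).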
5. LARGE-SAMPLE TESTS OF THE HYPOTHESES IN § 2

Assuming that $\sum_i n_{ij} = n_{oj}$ (fixed) and $\sum_i p_{ij} = p_{oj} = 1$ (where $i$ itself may be a multiple subscript like $i_1, i_2, \ldots$ and $j$ may be a multiple subscript like $j_1, j_2, \ldots$) and using the traditional approach of conditional probability, we have, under any hypothesis that expresses the $p_{ij}$'s in terms of a lesser number of free or nuisance parameters, the result that as $n \to \infty$ subject to $n_{oj}/n$ being held constant,

$$\sum_{i,j} \frac{(n_{ij} - n_{oj} \hat{p}_{ij})^2}{n_{oj} \hat{p}_{ij}}$$

tends to have the $\chi^2$-distribution with degrees of freedom equal to the number of cells minus the number of linear constraints on the $n_{ij}$'s (which we shall here replace by the number of separate multinomial distributions) minus the number of independent parameters that are estimated from the data. Notice that the $\hat{p}_{ij}$'s are the maximum-likelihood estimates of the $p_{ij}$'s obtained by expressing the $p_{ij}$'s in terms of the lesser number of free parameters from the hypothesis, then substituting in $\phi$, and then maximizing the $\phi$ with respect to these free or nuisance parameters. However, here we do not rely on the traditional approach but upon a proof of the above theorem which starts from

$$\phi = \prod_j \left[ \frac{n_{oj}!}{\prod_i n_{ij}!} \prod_i p_{ij}^{\,n_{ij}} \right]$$

and proceeds along the lines followed by Cramér. This, among other theorems, is given in a paper to be shortly submitted to Biometrika.


5·1. The problem of § 2·1

We consider § 2·1, start from (2·1·1), maximize $\log \phi_0$ with respect to the $p_{io}$'s and $p_{oj}$'s subject to $\sum_i p_{io} = \sum_j p_{oj} = 1$ (using Lagrangian multipliers) and obtain the maximum-likelihood solutions $\hat{p}_{io} = n_{io}/n$ and $\hat{p}_{oj} = n_{oj}/n$. The number of independent parameters estimated from the data is $r + s - 2$, and hence by § 5, the test of independence here is based on a statistic which has the $\chi^2$-distribution with degrees of freedom

$$rs - 1 - (r + s - 2) = (r - 1)(s - 1)$$

and whose form is

$$\sum_{i,j} \left( n_{ij} - \frac{n_{io} n_{oj}}{n} \right)^2 \Big/ \frac{n_{io} n_{oj}}{n}. \tag{5·1·1}$$
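The statistic (5·1·1) and its degrees of freedom are straightforward to compute from an $r \times s$ table of frequencies. The following is a minimal sketch with a hypothetical function name; it assumes every marginal total is positive so that no expected frequency is zero.

```python
def chi2_independence(table):
    """Return the statistic (5·1·1) and its degrees of freedom (r - 1)(s - 1)."""
    r, s = len(table), len(table[0])
    row = [sum(table[i]) for i in range(r)]                        # n_io
    col = [sum(table[i][j] for i in range(r)) for j in range(s)]   # n_oj
    n = sum(row)
    stat = 0.0
    for i in range(r):
        for j in range(s):
            expected = row[i] * col[j] / n     # n_io * n_oj / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat, (r - 1) * (s - 1)


# Illustrative 2 x 2 frequencies (an assumption, not data from the paper).
stat, df = chi2_independence([[10, 20], [30, 40]])
```

For the illustrative table the expected frequencies are 12, 18, 28 and 42, so the statistic is 200/252 on one degree of freedom.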


5·2. The problem of § 2·2

We start from (2·2·1) and maximize $\log \phi_0$ with respect to the $q_{oj}$'s subject to $\sum_j q_{oj} = 1$ and obtain the maximum-likelihood solutions: $\hat{q}_{oj} = n_{oj}/n$. The number of independent parameters estimated from the data is $s - 1$, and hence by § 5 the test here is to be based on a statistic which has the $\chi^2$-distribution with degrees of freedom

$$r(s - 1) - (s - 1) = (r - 1)(s - 1)$$

and whose form is

$$\sum_{i,j} \left( n_{ij} - \frac{n_{io} n_{oj}}{n} \right)^2 \Big/ \frac{n_{io} n_{oj}}{n}. \tag{5·2·1}$$


5·3. The problem of § 2·3

Under the null hypothesis, and using the remarks of § 5, we note that the test is based on a statistic having the $\chi^2$-distribution with degrees of freedom $rs - (r + s - 1) = (r - 1)(s - 1)$ and of the form

$$\sum_{i,j} \left( n_{ij} - \frac{n_{io} n_{oj}}{n} \right)^2 \Big/ \frac{n_{io} n_{oj}}{n}. \tag{5·3·1}$$


6. LARGE-SAMPLE TESTS OF THE HYPOTHESES IN § 3

6·1. The problems of § 3·1, i.e. where ‘i’, ‘j’ and ‘k’ are all ‘variates’

6·1a. The problem of § 3·1a

Independence between ‘i’ and ‘j’ for fixed ‘k’. Under $H_0$ of (3·1·1) we shall have

$$\phi_0 \propto \prod_{i,j,k} (p_{iok}\, p_{ojk} / p_{ook})^{n_{ijk}}. \tag{6·1·1}$$

To test the hypothesis here we maximize $\log \phi_0$ with respect to the $p_{iok}$'s, $p_{ojk}$'s and $p_{ook}$'s subject to $\sum_i p_{iok} = \sum_j p_{ojk} = p_{ook}$ and $\sum_k p_{ook} = 1$, and obtain the maximum-likelihood solutions $\hat{p}_{iok} = n_{iok}/n$, $\hat{p}_{ojk} = n_{ojk}/n$ and $\hat{p}_{ook} = n_{ook}/n$. The number of independent parameters estimated from this data is $(r - 1)t + (s - 1)t + (t - 1)$. And hence by § 5 the test of conditional independence is here based on a statistic which has the $\chi^2$-distribution with degrees of freedom $rst - 1 - t(r - 1) - t(s - 1) - (t - 1) = t(r - 1)(s - 1)$ and whose form is

$$\sum_{i,j,k} \left( n_{ijk} - \frac{n_{iok} n_{ojk}}{n_{ook}} \right)^2 \Big/ \frac{n_{iok} n_{ojk}}{n_{ook}}. \tag{6·1·2}$$
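The statistic (6·1·2) is the two-way independence statistic computed within each level of ‘k’ and then summed over ‘k’. The sketch below assumes a hypothetical function name and a frequency array stored as nested lists indexed as `n[i][j][k]`, with every layer having positive marginal totals; the example frequencies are illustrative only.

```python
def chi2_conditional_independence(n):
    """Return the statistic (6·1·2) and its degrees of freedom t(r - 1)(s - 1)."""
    r, s, t = len(n), len(n[0]), len(n[0][0])
    stat = 0.0
    for k in range(t):
        n_iok = [sum(n[i][j][k] for j in range(s)) for i in range(r)]
        n_ojk = [sum(n[i][j][k] for i in range(r)) for j in range(s)]
        n_ook = sum(n_iok)
        for i in range(r):
            for j in range(s):
                expected = n_iok[i] * n_ojk[j] / n_ook
                stat += (n[i][j][k] - expected) ** 2 / expected
    return stat, t * (r - 1) * (s - 1)


# Illustrative 2 x 2 x 2 frequencies (an assumption): layer k = 0 is [[10, 20], [30, 40]]
# and layer k = 1 is [[2, 4], [3, 6]], the latter being exactly of product form.
layers = [[[10, 20], [30, 40]], [[2, 4], [3, 6]]]
n = [[[layers[k][i][j] for k in range(2)] for j in range(2)] for i in range(2)]
stat, df = chi2_conditional_independence(n)
```

Only the first layer contributes here, so the statistic equals 200/252 on two degrees of freedom.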


Independence between ‘i’ and ‘k’ and also between ‘j’ and ‘k’. This can be handled exactly on the lines of § 5 and will not be discussed separately.

Independence between ‘i’, ‘j’ and ‘k’. To test this we start from the hypothesis of (3·1·3) giving

$$\phi_0 \propto \prod_{i,j,k} (p_{ioo}\, p_{ojo}\, p_{ook})^{n_{ijk}}, \tag{6·1·3}$$

maximize $\log \phi_0$ with respect to the $p_{ioo}$'s, $p_{ojo}$'s and $p_{ook}$'s subject to $\sum_i p_{ioo} = \sum_j p_{ojo} = \sum_k p_{ook} = 1$ and obtain the maximum-likelihood solutions: $\hat{p}_{ioo} = n_{ioo}/n$, $\hat{p}_{ojo} = n_{ojo}/n$ and $\hat{p}_{ook} = n_{ook}/n$. The number of independent parameters estimated from the data is $(r + s + t - 3)$, and hence by § 5 the test is here based on a statistic which has the $\chi^2$-distribution with degrees of freedom $rst - 1 - (r + s + t - 3) = rst - r - s - t + 2$, and whose form is

$$\sum_{i,j,k} \left( n_{ijk} - \frac{n_{ioo} n_{ojo} n_{ook}}{n^2} \right)^2 \Big/ \frac{n_{ioo} n_{ojo} n_{ook}}{n^2}. \tag{6·1·4}$$


6·1b. The problems of § 3·1b

Independence between ‘(i,j)’ and ‘k’. Under $H_0$ of (3·1·4) we shall have

$$\phi_0 \propto \prod_{i,j,k} (p_{ijo}\, p_{ook})^{n_{ijk}}.$$

To test this hypothesis we maximize $\log \phi_0$ with respect to the $p_{ijo}$'s and $p_{ook}$'s subject to $\sum_{i,j} p_{ijo} = \sum_k p_{ook} = 1$ and obtain the maximum-likelihood solutions: $\hat{p}_{ijo} = n_{ijo}/n$ and $\hat{p}_{ook} = n_{ook}/n$. The number of independent parameters estimated from the data is $(rs - 1) + (t - 1)$ and hence by § 5 the test is based on a statistic having the $\chi^2$-distribution with degrees of freedom $rst - 1 - [(rs - 1) + (t - 1)] = (rs - 1)(t - 1)$ and having the form

$$\sum_{i,j,k} \left( n_{ijk} - \frac{n_{ijo} n_{ook}}{n} \right)^2 \Big/ \frac{n_{ijo} n_{ook}}{n}.$$


Independence between ‘i’ and ‘k’ and between ‘j’ and ‘k’. Since this can be handled on the same lines as in § 5, it will not be separately discussed.

The ‘no interaction’ hypothesis of (3·1·6). This has been discussed in detail in another paper by Roy & Kastenbaum (1956) and will not be discussed here. The test will be based on a statistic having the $\chi^2$-distribution with degrees of freedom $(r - 1)(s - 1)(t - 1)$ and having a rather complicated form (see Bartlett, 1935; Norton, 1945; Roy & Kastenbaum, 1956).
6·2. The problems of § 3·2, i.e. when ‘i’ and ‘j’ are ‘variates’ and ‘k’ is a ‘way of classification’

6·2a. The problems of § 3·2a

Independence between ‘i’ and ‘j’ for each ‘k’. Under $H_0$ of (3·2·1) we start from

$$\phi_0 \propto \prod_{i,j,k} (p_{iok}\, p_{ojk})^{n_{ijk}}$$

and maximize $\log \phi_0$ with respect to the $p_{iok}$'s and $p_{ojk}$'s subject to $\sum_i p_{iok} = \sum_j p_{ojk} = p_{ook} = 1$, and obtain the maximum-likelihood solutions: $\hat{p}_{iok} = n_{iok}/n_{ook}$, $\hat{p}_{ojk} = n_{ojk}/n_{ook}$. The number of independent parameters to be estimated from these data is $t(r - 1) + t(s - 1)$, and hence the test here is to be based on a statistic having the $\chi^2$-distribution with degrees of freedom $t(rs - 1) - t(r - 1) - t(s - 1) = t(r - 1)(s - 1)$ and having the form

$$\sum_k \sum_{i,j} \left( n_{ijk} - \frac{n_{iok} n_{ojk}}{n_{ook}} \right)^2 \Big/ \frac{n_{iok} n_{ojk}}{n_{ook}}. \tag{6·2·2}$$

The problems under (3·2·2) or (3·2·3) will not be discussed separately.


6·2b. The problems of § 3·2b

The hypothesis that $p_{ijk}$ is independent of ‘k’. Under $H_0$ of (3·2·4) we start from

$$\phi_0 \propto \prod_{i,j,k} q_{ijo}^{\,n_{ijk}}, \tag{6·2·3}$$

maximize $\log \phi_0$ with respect to the $q_{ijo}$'s subject to $\sum_{i,j} q_{ijo} = 1$, and obtain the maximum-likelihood solutions: $\hat{q}_{ijo} = n_{ijo}/n$. The number of independent parameters to be estimated from the data is $(rs - 1)$ and hence by § 5 the test is to be based on a statistic having the $\chi^2$-distribution with degrees of freedom $t(rs - 1) - (rs - 1) = (rs - 1)(t - 1)$ and having the form

$$\sum_{i,j,k} \left( n_{ijk} - \frac{n_{ook} n_{ijo}}{n} \right)^2 \Big/ \frac{n_{ook} n_{ijo}}{n}.$$

The problems under (3·2·5) or (3·2·6) will not be separately discussed here.
6·3. The problems of § 3·3, i.e. when ‘i’ is a ‘variate’ and ‘j’ and ‘k’ are ‘ways of classification’

6·3a. The problem of § 3·3a

The hypothesis that for any ‘k’, $p_{ijk}$ is independent of ‘j’. Under $H_0$ of (3·3·1) we start from

$$\phi_0 \propto \prod_{i,j,k} q_{iok}^{\,n_{ijk}} \tag{6·3·1}$$

and maximize $\log \phi_0$ with respect to the $q_{iok}$'s subject to $\sum_i q_{iok} = 1$, and obtain the maximum-likelihood solutions $\hat{q}_{iok} = n_{iok}/n_{ook}$. The number of independent parameters to be estimated from the data is $t(r - 1)$, and hence by § 5 the test is to be based on a statistic having the $\chi^2$-distribution with degrees of freedom $st(r - 1) - t(r - 1) = t(r - 1)(s - 1)$ and having the form

$$\sum_{i,j,k} \left( n_{ijk} - \frac{n_{ojk} n_{iok}}{n_{ook}} \right)^2 \Big/ \frac{n_{ojk} n_{iok}}{n_{ook}}. \tag{6·3·2}$$


6·3b. The problem of § 3·3b

This will be exactly on the same lines as the previous case and will not be discussed separately. We shall also omit a discussion of the problem under (3·3·3).

6·4. The problems of § 3·4, i.e. when ‘i’ is a variate and ‘j’ and ‘k’ are ways of classification in the sense of an incomplete design

The hypothesis that $p_{ijk}$ is independent of ‘j’. We start from (3·4), put $p_{ijk} = q_{iok}$ and thus have $\phi_0 \propto \prod_j \prod_{k \in (t)_j} \prod_i q_{iok}^{\,n_{ijk}}$. On maximizing $\log \phi_0$ with respect to the $q_{iok}$'s subject to $\sum_i q_{iok} = 1$ we obtain a solution for the $q_{iok}$'s in terms of the $n_{ijk}$'s which is a rather complicated set of functions of the $n_{ijk}$'s of the same structure as the corresponding least-squares solutions in linear estimation. One or two such solutions for some linked block designs will be discussed in a later paper. However, this solution, inserted in the $\chi^2$ function, will have the $\chi^2$-distribution with degrees of freedom $(r - 1) \sum_{j=1}^{s} t_j - t(r - 1)$.

The hypothesis that $p_{ijk}$ is independent of ‘k’ can be handled on exactly similar lines and need not be separately considered.


7. CONCLUDING REMARKS

The extension from three to more dimensions does not present any essentially new problems. One or two additional features of interest of such extensions will be discussed later.

The authors believe that this is an attempt at a somewhat systematic exposition (i) which is based on a clear distinction between a ‘variate’ and a ‘way of classification’ that stems from differing experimental situations and sampling schemes, (ii) which sets up different probability models for the different situations, and (iii) which poses different types of hypothesis according as it is a ‘multivariate analysis’ situation or an ‘analysis of variance’ situation or something of a mixed type. Every type of experimental situation has its appropriate sampling scheme and appropriate probability model, and the authors believe that it would not be proper to force some kind of sampling scheme and probability model on the wrong kind of experimental situation and also to pose one kind of hypothesis on the wrong kind of sampling scheme and probability model.

For two analogous null hypotheses under two different probability models we find eventually the same $\chi^2$ with the same distribution under the respective null hypotheses. But the powers of these tests, that is, the distributions of these statistics under the respective non-null hypotheses, are entirely different. In large samples these powers would of course, in the usual sense, tend to 1 in each case. But the asymptotic powers (in the sense of Pitman and Lehmann, which is perhaps the only sense that is relevant here) can be obtained and compared. They have been obtained; they are different and will be given in a later paper.

In many situations in which a particular hypothesis $H_0$ with an associated $\chi^2$ is the intersection of several hypotheses $H_{01}, H_{02}, \ldots$, with associated $\chi_1^2, \chi_2^2, \ldots$, it so happens that $\chi^2 = \chi_1^2 + \chi_2^2 + \ldots$ and $\chi_1^2, \chi_2^2, \ldots$ are also independently distributed, but the additivity is not in the usual algebraic sense: it is only in probability and asymptotically as $n \to \infty$, and the independence is also in the asymptotic sense. Take, for example, the hypotheses (3·1·1), (3·1·2) and (3·1·3) and let us call them $H_{01}$, $H_{02}$ and $H_{03}$ and $H_0$. We note that $H_0 = H_{01} \cap H_{02} \cap H_{03}$. Let the associated $\chi^2$'s be denoted by $\chi_1^2$, $\chi_2^2$, $\chi_3^2$ and $\chi^2$. Then, in this case it has been shown that, in large samples and under the null hypothesis, $\chi_1^2$, $\chi_2^2$ and $\chi_3^2$ are independently distributed and $\chi_1^2 + \chi_2^2 + \chi_3^2 \to \chi^2$ in probability. We have an exactly similar situation for the group of hypotheses (3·1·4), (3·1·5) and (3·1·6). These are situations in multivariate analysis.

There are similar situations in analysis of variance also, for example, with the group of hypotheses (3·2·4), (3·2·5) and (3·2·6) or with the group (3·3·1), (3·3·2) and (3·3·3). But this will not be true, for example, with a similar group of hypotheses on an incomplete block design or general types of designs indicated in § 3·4. A proof of ‘independence’ and ‘additivity’ covering the problems of this paper and some other problems (where ‘independence’ and ‘additivity’ exist) will be given in a later paper.
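The asymptotic additivity described above is at least consistent with the degrees of freedom, which add exactly. The sketch below checks this for the two multivariate groups; the function names are hypothetical, and the degrees of freedom $(r-1)(t-1)$ and $(s-1)(t-1)$ for the two marginal independence hypotheses are an assumption obtained by applying the two-way formula of § 5·1 to the marginal tables, since the paper does not state them explicitly.

```python
def df_conditional(r, s, t):
    """Degrees of freedom for (3·1·1): t(r - 1)(s - 1)."""
    return t * (r - 1) * (s - 1)


def df_two_way(a, b):
    """Degrees of freedom for independence in an a x b marginal table (assumed from § 5·1)."""
    return (a - 1) * (b - 1)


def df_complete(r, s, t):
    """Degrees of freedom for complete independence (3·1·3): rst - r - s - t + 2."""
    return r * s * t - r - s - t + 2


def df_pair_vs_k(r, s, t):
    """Degrees of freedom for (3·1·4): (rs - 1)(t - 1)."""
    return (r * s - 1) * (t - 1)


def df_no_interaction(r, s, t):
    """Degrees of freedom for (3·1·6): (r - 1)(s - 1)(t - 1)."""
    return (r - 1) * (s - 1) * (t - 1)


def additivity_holds(r, s, t):
    """Check that both groups of component degrees of freedom add to the total."""
    marginal = df_two_way(r, t) + df_two_way(s, t)
    first = df_conditional(r, s, t) + marginal == df_complete(r, s, t)
    second = marginal + df_no_interaction(r, s, t) == df_pair_vs_k(r, s, t)
    return first and second
```

Both identities hold for every table size, for example for all $r$, $s$, $t$ between 2 and 6.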
We have defined the hypothesis of ‘no interaction’ between ‘i’ and ‘j’ when ‘i’, ‘j’ and ‘k’ are all ‘variates’ or ‘i’ and ‘j’ are variates and ‘k’ is a way of classification. It is a kind of bridge over the gap between (a) the hypotheses of independence of ‘i’ and ‘k’, ‘j’ and ‘k’, and (b) the hypotheses of independence of ‘(i,j)’ and ‘k’. Through this mechanism we get a formula of which the special cases for 2 × 2 × 2 and 2 × 2 × t were obtained by earlier workers (Bartlett, 1935; Norton, 1945). They obtained the formula through presumably some different mechanism, and perhaps for the analysis of variance situations where ‘i’ and ‘j’ are ‘ways of classification’ and ‘k’ is a ‘variate’. The authors feel that for the analysis of variance situations with frequency types of data the hypothesis of ‘no interaction’ in the form in which it is usually posed and tested may not be too meaningful, and they doubt whether, even with ‘normal variate’ type of data, it is very useful. The controversy that has raged for many years now over questions of interpretation is not without its own lessons. The authors feel this way about the whole concept of interaction in analysis of variance.


Looking for a possible motivation behind the customary (and mostly ‘normal variate’) analysis, one cannot help feeling that factorial experiments (whether on the ‘normal variate’ type of data or more general types of data) present a problem which is essentially different from that of the rest of analysis of variance, e.g. the usual tests of significance of treatment differences. Assume, for simplicity, in the beginning that there is just one factor at, say, k levels. One might regard these as treatments, and test whether there are significant differences between these, or, in terms of some characteristic, pick out the ‘best’ among these or rank these in some order. But that does not appear to be the relevant question here. The (second) characteristic in terms of which we have the k levels seems to be a continuous variate which is observed at k levels for practical convenience and what is needed is to lay down a statistical rule by which we can decide about the ‘best’ or ‘optimum’ point in the range of the continuous variate, the decision rule being in terms of observations at k discrete levels and the ‘best’ or ‘optimum’ being in relation to the first characteristic. Likewise, taking, for example, two factors at k and l levels respectively, the problem seems to be not to test whether there are significant differences between these kl combinations regarded as treatments (which would really be a linear problem) or to pick out the ‘best’ among these or to rank these (in terms of some characteristic), which again would be an essentially linear problem. It seems that there is a (second) characteristic in terms of which we have the k levels of the first factor and a third characteristic in terms of which we have the l levels of the second factor, both these (second and third) characteristics being supposed to be continuous variates. The problem is to lay down a statistical rule by which we can decide about the ‘best’ or ‘optimum’ point (in relation to the first characteristic) on the plane of the second and third characteristics, regarded as two continuous variates, the decision rule being in terms of observations at the kl discrete level combinations. This of course can be generalized to several factors. The customary analysis into ‘main effects’, ‘interactions’ of various orders, confounding, etc., all seem to point very strongly in this direction. Some work has already been done from this standpoint and further work is under way. This will be discussed in a later monograph.

In conclusion, it is a great pleasure to thank the editor and the referees for their valuable 
suggestions for the improvement of the paper both in form and in content. 


REFERENCES

BARNARD, G. A. (1947). Significance tests for 2 x 2 tables. Biometrika, 34, 123-38.
BARTLETT, M. S. (1935). Contingency table interactions. J. R. Statist. Soc. Suppl. 2, 248-92.
CRAMÉR, H. (1946). Mathematical Methods of Statistics. Princeton University Press.
FISHER, R. A. (1922). On the interpretation of χ² from contingency tables and the calculation of P. J. R. Statist. Soc. 85, 87-94.
NEYMAN, J. (1949). Contribution to the theory of the χ² test. Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, pp. 239-73. University of California Press.
NORTON, H. W. (1945). Calculation of chi-square for complex contingency tables. J. Amer. Statist. Ass. 40, 251-8.
PEARSON, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed in a 2 x 2 table. Biometrika, 34, 139-67.
REIERSØL, O. (1954). Tests of linear hypotheses concerning binomial experiments. Skand. AktuarTidskr. 37, 38-59.
ROY, S. N. (1953). On a heuristic method of test construction and its use in multivariate analysis. Ann. Math. Statist. 24, 220-38.
ROY, S. N. & KASTENBAUM, M. A. (1956). On the hypothesis of no interaction in a multiway contingency table. Ann. Math. Statist. 27, 749-57.
ROY, S. N. & SARHAN, A. E. (1956). On inverting a class of patterned matrices. Biometrika, 43, 227-31.
YULE, G. U. & KENDALL, M. G. (1950). An Introduction to the Theory of Statistics, 14th ed. London: Chas. Griffin and Co.



A TWO-SAMPLE DISTRIBUTION-FREE TEST 


By A. R. KAMAT 


Fergusson College, University of Poona 


1. Mann & Whitney (1947) proposed a test for the equivalence of two distribution
functions which was based on the sample ranks. Given two samples of size m and n they
suggested that the two samples be pooled and ranked in ascending (or descending) order of
magnitude. The sum of the ranks of one of the samples is then used in a test of the hypothesis
that the two samples have come from the same population or, if it is preferred, of the
hypothesis of the equivalence of the two distribution functions. Such a criterion will
be most sensitive to possible differences of location of the two distribution functions.
Rosenbaum (1953) assumed that the location parameters of the distribution functions were
equal, so that the criterion he suggested could be used to test for the equivalence of parameters
of dispersion by counting the numbers in one sample falling outside the range of the other.
An alternative test to Rosenbaum's is discussed here, using a criterion based on the rank
ranges of both samples.

2. It is assumed that there are two samples $\{x_i\}$ $(i = 1, 2, ..., n)$ and $\{y_j\}$ $(j = 1, 2, ..., m)$.
The location parameters of the distributions of x and y are assumed to be equal.
The two samples are pooled and arranged in order of magnitude. Let $R_n$ and $R_m$ be the ranges of
the ranks of x and y respectively. The test statistic proposed is

$$D_{n,m} = R_n - R_m + m, \tag{1}$$

where $D_{n,m}$ can take values $0, 1, ..., m+n$. Large or small values of $D_{n,m}$ will indicate
possible divergence from the hypothesis that the parameters of dispersion of the populations
from which the samples have been drawn are equal.
3. The following example illustrates the calculation of $D_{n,m}$. Two analysts in the same
[laboratory] made repeated determinations of the percentage fibre in a sample of soya cattle
cake which had been ground down into a uniformly mixed powder. The first analyst made
n = 8 determinations and the second m = 10. The observations were as follows, the ranks
of each, from 1 to 18, after pooling being shown.* It will be seen that

$$D_{8,10} = 12 - 17 + 10 = 5.$$

The question is whether this result suggests a significant difference in the accuracy with
which the two men can repeat their analyses:

Analyst 1: Observations  12.38  12.53  12.26  12.37  12.48  12.43  12.30  12.46
           Pooled rank     8      2     14      9      3      6     12      4
Analyst 2: Observations  12.45  12.31  12.31  12.20  12.26  12.73  12.42  12.17  12.09  12.23
           Pooled rank     5    10.5   10.5    16     13      1      7     17     18     15

n = 8, $R_n = 14 - 2 = 12$; m = 10, $R_m = 18 - 1 = 17$.

If we enter the table of percentage points (p. 378 below) with m+n = 18, n = 8, it is
seen that for a two-tailed test, using a 5% significance level, the lower and upper limits for
$D_{8,10}$ are 3 and 15 respectively, so that the observed value of 5 is not significant.

* There is one tie, i.e. two readings of 12.31% by analyst 2, but this does not affect the value of $D_{8,10}$.
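The calculation in the example can be sketched in a few lines of Python. This is only an illustration of equation (1), not part of the original paper; the function name `kamat_d` is hypothetical, and the last pooled rank of analyst 1 (taken here as 4) is inferred from the ordering of the observations because it is illegible in the scanned source.

```python
def kamat_d(x_ranks, y_ranks):
    """Kamat's statistic D = R_n - R_m + m, computed from pooled ranks.

    x_ranks: pooled ranks of the n-sample; y_ranks: pooled ranks of the m-sample.
    """
    r_n = max(x_ranks) - min(x_ranks)   # range of the ranks of the x-sample
    r_m = max(y_ranks) - min(y_ranks)   # range of the ranks of the y-sample
    return r_n - r_m + len(y_ranks)     # m is the size of the y-sample


# Pooled ranks from the fibre-determination example (analyst 1: n = 8, analyst 2: m = 10).
analyst1 = [8, 2, 14, 9, 3, 6, 12, 4]   # the final rank 4 is inferred, see lead-in
analyst2 = [5, 10.5, 10.5, 16, 13, 1, 7, 17, 18, 15]

d = kamat_d(analyst1, analyst2)
print(d)
```

Only the extreme ranks of each sample enter the statistic, which is why the tie between two interior readings leaves D unchanged.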




Table 1. Percentage points of $D_{n,m}$

For the lower points, $P\{D_{n,m} \le D_\alpha\} \le \alpha$; for the upper points $P\{D_{n,m} \ge D_\alpha\} \le \alpha$, with $100\alpha$ = 0.5, 1.0,
2.5, 5.0. A stroke indicates that no value of $D_\alpha$ satisfies the inequality.


a 
| Lower | Upper | | Lower Upper | " 
percentage points | percentage points | percentage points | percentage pom 
m+n) n | | _|m+n| x | = a 
| 
05 L0 25 5-0 5-0 25 L0 05 | D LO 25 50|50 25 L0 05 
| | | | 
| | T 
7 2 6 16 g pæ ae = EE UN 
3| | ai = o © |e Ta: 10998 
&ál— 6 x il im 10 18 
8 | 2 7 blo 0 1 glim 15 16 16 
S 8 6/0 1 2 3|14 14 16 18 
Ee d dium a 7/1 1 2 3|18 14 16 E 
| 8|1 1 2 $|is 14 15 1 
9 | 2 = E: 
38————|99—-—|[mnm|z]——-— —| 15 19 1 
4[———0|9 9— — 3|— — 0 0 15 15 M 5 
4|— 0 1 2]|15 16 17 : 
ie te 4]. 9t dg a 5|o 1 2 2]|15 16 16 D 
3|— — — —|10 10 — — ejr 1 3 B lato 15939 n 
sa = 5 0110 i e = gi i zr @ 9 à 15 19 
bill = © olw 10 = = 8[1 2 3 3 |14 15 15 | 
| 
u|2|——— —|1 10 — —| 13 | 2, — — — —|15 16 17 | 
3|——— o|n n — — 8|— — o 1/16 18 15 x5 
4|— — 0 0|10 n n — ajo o 3 &|i mm 
$|—— 0 1/1 n nu — B|o 2 9 $|H ig ui 
6|1 1 2 $|1& 16 JT pa 
N ————19 WM = = yi 2 3 4 | 16° 16 eee 
3|— — — o|n 12 12 — elt 2 3 4|15 15 16 ir 
4/— — 0 o|n 1 12 — plr 9 s 4/14 15 J6 
Shes "Occ WoW" Ts 32 a E 
& Eso xa, 35 99 | m8 | gj — ay i8 
S£ e io i | 1g: 5 Tee 
ye — Ste — _ ajo o ri 2|7@ 18 28 oe 
è |= — Oi 49 48 — S10 1 m ajm UU m 
4/— — 0 1]|12 13 13 13 6/1 2 g 4/18 17 Jos 
$00 1 Ti ie 13 B 71/1 2 3 4|16 16 17 17 
$.| 0 0 1 1/12 2 33 13 s|2 2 a3 4|18 16 1? 5» 
aja 2 2 a| m u 
NNA | ex 19 8 em E. 
3 oc H2. Te 76 | um |e le mm oe ele 5 2 T 
4 |— o 0 11/13 13 14 14 s| = w L| u w Bod ' 
5|0 0 1 2il 13 14 14 410 o 1 z|ivz 19 19999 j 
$50 0 1 2]|12 13 14 14 8|1 1 x 8 | 18 18 M5 LET 
POO” 0p Salis Si. 12. d eia 2 3 ajir 5 Ceara 
is 7|* & 4 sir TL E 
ove p—— ne 1 1S T4 qas E sl 5 a 518 17 CER l 
eI — 30* (0| 18. ie ib d6 9/2 3 a § |16 17 97 35 
ae 0! Aor “Ie a. là; d$ "B wo | 2 3 & 4 \a6 36 2% 
5|0 0 1 21/13 14 15 I5 
BD 2.1. Bilis 14. i4 15 
T| 21 @ Glas. i3 eds 




4. The probability distribution of $D_{n,m}$ follows from the application of simple combinatorial
rules. The total number of ways in which the sequence can be arranged is $\binom{m+n}{n}$
and this will be therefore the totality of the values of $D_{n,m}$. This totality of values can also
be built up as follows:

(a) $R_n = n-1$, $R_m = m-1$; $D_{n,m} = n$.
There are only two ways in which this can happen. Either the x-sample takes the first
n places or the last n.

(b) $R_n = n-1+i$ $(i = 0, 1, ..., m-1)$, $R_m = m-1+n$; $D_{n,m} = i$.
Let the y-sample occupy the first and the last places and the x-sample n of the $m+n-2$
remaining places, remembering that $R_n$ must equal $n-1+i$. This can be done in

$$(m-i-1)\binom{n+i-2}{n-2} \text{ ways.}$$

(See, for instance, Whitworth (1901), Choice and Chance, chapter III.)

(c) $R_n = n-1+m$, $R_m = m-1+j$ $(j = 0, 1, ..., n-1)$; $D_{n,m} = n+m-j$.
By symmetry the number of arrangements is

$$(n-j-1)\binom{m+j-2}{m-2} \text{ ways.}$$

(d) $R_n = n-1+i$, $R_m = m-1+j$; $D_{n,m} = n+i-j$.
Excluding (a), (b) and (c) it can be shown that the number of ways will be

$$2\binom{i+j-2}{j-1} \quad (i = 1, 2, ..., m-1;\ j = 1, 2, ..., n-1).$$

Combining these four cases we may write

$$P\{D_{n,m} = n+i-j = r\} = \binom{m+n}{n}^{-1}\left[A_r (m-r-1)\binom{n+r-2}{n-2} + B_r (r-m-1)\binom{2m+n-r-2}{m-2} + 2\sum_j \binom{r-n+2j-2}{j-1} + 2E_r\right], \tag{2}$$

where the sum is over those j with $1 \le j \le n-1$ and $1 \le r-n+j \le m-1$, and where
$A_r = 1$ for $r < m$, $B_r = 1$ for $r > m$, $E_r = 1$ for $r = n$, and they are zero otherwise.
When the two samples are of the same size a certain simplification of the probability
distribution function is possible. Write m = n, whence

$$P\{D_{n,n} = r\} = \binom{2n}{n}^{-1}\left[D_r (|r-n|-1)\binom{2n-|r-n|-2}{n-2} + 2\sum_j \binom{r-n+2j-2}{j-1} + 2E_r\right], \tag{3}$$

where the sum is over $\max(1, n-r+1) \le j \le \min(n-1, 2n-1-r)$, $D_r = 1$ for $r \ne n$,
$E_r = 1$ for $r = n$ and they are both zero otherwise. Values
of the percentage points for $m+n \le 20$ are given in Table 1. They are calculated from the
distribution function of $D_{n,m}$ given above.
The calculation of percentage points from the exact distribution clearly becomes
impracticable when the sequence becomes at all large.
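The case-by-case count (a) to (d) can be checked numerically. The following sketch is an illustration rather than the author's own procedure: it counts arrangements both by brute-force enumeration and by the four cases, and derives percentage points on the assumption that the lower point is the largest value with $P\{D \le D_\alpha\} \le \alpha$ and the upper point the smallest value with $P\{D \ge D_\alpha\} \le \alpha$. All function names are hypothetical, and both samples are assumed to have at least two members.

```python
from collections import Counter
from itertools import combinations
from math import comb


def d_counts_bruteforce(n, m):
    """Count arrangements giving each value of D by listing every placement of the x-sample."""
    total_places = n + m
    counts = Counter()
    for xs in combinations(range(1, total_places + 1), n):
        chosen = set(xs)
        ys = [r for r in range(1, total_places + 1) if r not in chosen]
        counts[(xs[-1] - xs[0]) - (ys[-1] - ys[0]) + m] += 1
    return {d: c for d, c in counts.items() if c}


def d_counts_by_cases(n, m):
    """Count arrangements using cases (a)-(d); requires n >= 2 and m >= 2."""
    counts = Counter()
    counts[n] += 2                                   # case (a)
    for i in range(m):                               # case (b): D = i
        counts[i] += (m - i - 1) * comb(n + i - 2, n - 2)
    for j in range(n):                               # case (c): D = n + m - j
        counts[n + m - j] += (n - j - 1) * comb(m + j - 2, m - 2)
    for i in range(1, m):                            # case (d): D = n + i - j
        for j in range(1, n):
            counts[n + i - j] += 2 * comb(i + j - 2, j - 1)
    return {d: c for d, c in counts.items() if c}


def percentage_points(n, m, alpha):
    """Return (lower, upper) points; None where no value of D satisfies the inequality."""
    counts = d_counts_by_cases(n, m)
    limit = alpha * comb(n + m, n)
    lower = upper = None
    running = 0
    for d in range(0, n + m + 1):                    # accumulate the lower tail
        running += counts.get(d, 0)
        if running > limit:
            break
        lower = d
    running = 0
    for d in range(n + m, -1, -1):                   # accumulate the upper tail
        running += counts.get(d, 0)
        if running > limit:
            break
        upper = d
    return lower, upper


print(percentage_points(8, 10, 0.025))
```

For n = 8, m = 10 this reproduces the limits 3 and 15 quoted in the worked example of section 3.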




5. The moments of the distribution of $D_{n,m}$ can be obtained from the distribution itself
or by considering the moments of $R_n$ and $R_m$ and their cross-products. Whichever way is
chosen, the combinatorial sums are easily reduced by making use of the relation


£ v9 E re. 
ixpX 4 k+l ? 


although the reduction of the resulting polynomials is a long process. The first three
moments of $D_{n,m}$ are

$$\mu_1' = m - \frac{2m}{n+1} + \frac{2n}{m+1},$$

$$\mu_2 = \frac{2m(n-1)(m+n+1)}{(n+1)^2(n+2)} + \frac{2n(m-1)(m+n+1)}{(m+1)^2(m+2)} + \frac{8mn}{(m+1)(n+1)} - 4 + 4\Big/\binom{m+n}{n},$$

ü ..2m(n — 1) (n—3) (m - n - 1) (2m - n 1) | 2n(m — 1) (m—3) (m 4m 1) (m+ 2n4- V) 
3 (n+ 1 (n+ 2) (n 7-3) "in (m+ 1y (m+ 2) (m 4- 3) 


24(m —) (m — 1) (n— 1) ((m m 4 1) (ment 3)— mn} 
= (mar y? (n+ 1)? (m+ 2) (n 4- 2) 


 12(m—n) [| 4mm  — , n Dn feo 
miD imens ^ 6220-19-92] 4 


a) 


į ase 
The fourth moment was also worked out but is not reproduced here. For the symmetrical case
$m = n$,

$$\mu_1' = n, \qquad \mu_2 = \frac{4(3n^2-4n-2)}{(n+1)(n+2)} + 4\Big/\binom{2n}{n}, \qquad \mu_3 = 0. \tag{5}$$
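As a numerical check on the first two moments, the sketch below evaluates the mean and variance of $D_{n,m}$ for given sample sizes. The variance expression used here is a reconstruction of the garbled printed formula; it has been checked only against values quoted elsewhere in the paper (Table 2 and the worked example with m = 25, n = 15), so it should be read as an assumption rather than a verbatim transcription. The function names are hypothetical.

```python
from math import comb, sqrt


def kamat_mean(n, m):
    """Mean of D for an x-sample of size n and a y-sample of size m."""
    return m - 2 * m / (n + 1) + 2 * n / (m + 1)


def kamat_variance(n, m):
    """Variance of D (reconstructed form, see lead-in)."""
    t = m + n + 1
    from_rn = 2 * m * (n - 1) * t / ((n + 1) ** 2 * (n + 2))   # variance of R_n
    from_rm = 2 * n * (m - 1) * t / ((m + 1) ** 2 * (m + 2))   # variance of R_m
    cross = 8 * m * n / ((m + 1) * (n + 1)) - 4 + 4 / comb(m + n, n)  # -2 cov(R_n, R_m)
    return from_rn + from_rm + cross


# Worked example of the paper: m = 25, n = 15.
print(round(kamat_mean(15, 25), 2), round(kamat_variance(15, 25), 4),
      round(sqrt(kamat_variance(15, 25)), 3))
```

For equal sample sizes the mean reduces to n, in agreement with the symmetrical case.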
6. Figs. 1 and 2 show the form of the distribution of $D_{n,m}$ for m+n = 16, m = 8, 12.
The distribution is trimodal and it is not expected that the beta ratios, some values of which
are given in Tables 2 and 3, will be very helpful in deriving approximations by means of which to
extend the percentage limits of Table 1 beyond m+n = 20.* Some light is, however, thrown
on the problem by considering the limiting distribution of $D_{n,m}$.

7. The limiting distribution has been obtained by D. E. Barton in an addendum (p. 386)
following this paper. Let m+n = N and let N increase so that

$$m/N \to p, \qquad n/N \to q = 1-p.$$

If we write $d = D_{n,m} - m$, the limiting distribution of d is

$$f(d) = \begin{cases} p^{-d} q^{2}\{2p^{2}/(1-Q) - 1 - d\} & (d<0), \\ 2Q^{2}/(1-Q) & (d=0), \\ q^{d} p^{2}\{2q^{2}/(1-Q) - 1 + d\} & (d>0), \end{cases} \tag{6}$$

* Use of the Pearson-Merrington table of standardized deviates (Pearson & Hartley, 1954),
with the correct mean, standard deviation and beta values and the application of a continuity
correction, has been found to give surprisingly accurate estimates of the 2.5% points at m+n […]. The
possibilities of this method of approximation, however, need further investigation.


[Fig. 1. Distribution of $D_{n,m}$ for case m+n = 16, m = 8. True distribution and standardized limiting distribution; scale of f(D) against scale of D.]

[Fig. 2. Distribution of $D_{n,m}$ for case m+n = 16, m = 12. True distribution and standardized limiting distribution; scale of f(D) against scale of D.]


where $Q = pq$. When $p = q = \tfrac12$, we have for the symmetrical case

$$f(d) = \begin{cases} \tfrac16 & (d=0), \\ \tfrac13\, 2^{-|d|-2}(3|d|-1) & (d\neq 0). \end{cases} \tag{7}$$

The first four moments of the limiting distribution can be derived from the values
obtained in the finite case. They are as follows:

$$\begin{aligned} \mathcal{E}(d) &= 2(q-p)Q^{-1}, \\ \mu_2(d) &= 4 - 6Q^{-1} + 2Q^{-2}, \\ \mu_3(d) &= 2(q-p)(Q^{-1} - 5Q^{-2} + 2Q^{-3}), \\ \mu_4(d) &= 28 - 174Q^{-1} + 266Q^{-2} - 144Q^{-3} + 24Q^{-4}. \end{aligned} \tag{8}$$
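The limiting distribution and its moments can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, the probability function follows the three-branch form of the limiting distribution of d, and the tail expressions are those of Barton's addendum. The infinite sums are truncated at |d| = 400, which is an arbitrary cut-off chosen so that the neglected mass is negligible for the values of p used.

```python
def limiting_pmf(d, p):
    """Limiting probability of d = D - m, with p = lim m/N and q = 1 - p."""
    q = 1.0 - p
    big_q = p * q
    if d == 0:
        return 2 * big_q ** 2 / (1 - big_q)
    if d > 0:
        return q ** d * p ** 2 * (2 * q ** 2 / (1 - big_q) - 1 + d)
    return p ** (-d) * q ** 2 * (2 * p ** 2 / (1 - big_q) - 1 - d)


def upper_tail(d, p):
    """P{limiting variate >= d} for d > 0."""
    q = 1.0 - p
    return q ** d * (p * d - 1 + 2 * q / (1 - p * q))


def lower_tail(d, p):
    """P{limiting variate <= d} for d < 0."""
    q = 1.0 - p
    return p ** (-d) * (q * (-d) - 1 + 2 * p / (1 - p * q))


p = 0.6
q, big_q = 1 - p, p * (1 - p)
support = range(-400, 401)
total = sum(limiting_pmf(d, p) for d in support)
mean = sum(d * limiting_pmf(d, p) for d in support)
variance = sum((d - mean) ** 2 * limiting_pmf(d, p) for d in support)
print(total, mean, variance)
```

With p = 0.6 the computed mean and standard deviation agree with the entries -1.67 and 3.70 of Table 4.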


Table 2. Momental constants of $D_{n,m}$ for case m+n = 16

  n    m    Mean      S.D.     μ2        √β1      β2
  2   14    4.9333    3.805    14.4789    0.669   2.681
  3   13    6.9250    3.616    13.0765    0.178   2.340
  4   12    7.8154    3.271    10.7022    0.000   2.404
  5   11    8.1667    2.993     8.9567   -0.069   2.545
  6   10    8.2338    2.799     7.8355   -0.072   2.599
  7    9    8.1500    2.687     7.2188   -0.044   2.615
  8    8    8.0000    2.650     7.0225    0.000   2.618


Table 3. Momental constants of $D_{n,m}$ (symmetrical case)

  n = m    S.D.     μ2        β2
    2      1.155    1.3333    3.000
    3      1.673    2.8000    2.500
    4      2.014    4.0571    2.413
    5      2.250    5.0635    2.434
    7      2.550    6.5012    2.554
   10      2.796    7.8182    2.732
   15      2.998    8.8953    2.955
    ∞      3.464   12.0000    3.583

Note. For n = m the distribution is symmetrical, with mean at n.


It will be seen that if m and n are large and equal, i.e. $p = q = \tfrac12$ and $Q = pq = \tfrac14$, the limiting
values of the second and fourth moments and of the $\beta_2$ ratio become 12, 516 and 3.583
respectively.

8. Table 4 shows some values of the moment constants for the distribution of d. If
these are compared with values given in Tables 2 and 3 it becomes clear that the limiting
form has been nowhere nearly reached when m+n = 16. Nevertheless there is some similarity
in shape, at any rate for m and n not too unequal, which is brought out when the
limiting distribution is adjusted to have the same mean and standard deviation as the
distribution of $D_{n,m}$. This is shown in Figs. 1 and 2 in which limiting distributions with
$p = \tfrac12$ and $\tfrac34$ have been fitted onto the distributions with m+n = 16, m = 8, 12. The agreement
which can be said still to exist when m = 12 has vanished for m = 14 (as could be seen
were a similar comparison made). At m = 12 and 14 the limiting distribution gives an
appreciable frequency below zero, when standardized. The adjusted limiting distribution
has, of course, smaller intervals between its ordinates than the distribution for finite N,
since $\sigma_d > \sigma_D$, and this results in a greater number of ordinates for the former. Thus its
frequency polygon is, in general, below that for D. This effect would be corrected if the
cumulative distributions were compared.


Table 4. Moments and standardized deviates for the limiting distribution of d

Row                                          p = m/(m+n):   0.5          0.6           0.7           0.8           0.9
  1   Momental constants for    Mean                         0           -1.67         -3.81         -7.50         -17.78
  2   d = D - m                 S.D.                         3.46         3.70          4.56          6.68          13.57
  3                             √β1                          0           -0.49         -0.91         -1.30         -1.37
  4                             β2                           3.58         3.92          4.68          5.42          5.87
  5   Upper 2.5% point*         d+                           8 (0.017)    6 (0.015)     4 (0.021)     3 (0.015)     1 (0.012)
  6                             d-                           7 (0.030)    5 (0.031)     3 (0.050)     2 (0.043)     0 (0.030)
  7   Lower 2.5% point*         d+                          -7 (0.030)  -10 (0.028)   -15 (0.025)   -24 (0.027)   -52 (0.026)
  8                             d-                          -8 (0.017)  -11 (0.018)   -16 (0.019)   -25 (0.022)   -53 (0.024)
  9   Corrected and stan-       Upper                        1.99         1.77          1.57          1.44          1.29
 10   dardized 2.5% points      Lower                       -1.99        -2.20         -2.34         -2.46         -2.52

* The quantities d+ and d- are values of d which may be described as bracketing the 2.5% point. The figures
in parentheses are probabilities (calculated from Barton's formulae): (i) P{d ≥ d+} and P{d ≥ d-} for the upper
percentage point; (ii) P{d ≤ d+} and P{d ≤ d-} for the lower percentage point.

9. The following is a possible method of approximating to the percentage points for
$D_{n,m}$ when m+n > 20, making use of the similarity referred to in the preceding paragraph.

(a) First standardize the limiting distribution. The method is illustrated in the case of
the 2.5% points. Rows 5-8 of Table 4 show (for p = 0.5 (0.1) 0.9) the values of d which
bracket the upper and lower 2.5% points. Linear interpolation between d+ and d- gives
approximate 2.5% points to which a continuity correction of 0.5 has been applied. Thus
for the lower percentage point when p = 0.8, P{d ≤ -24} = 0.027 and P{d ≤ -25} = 0.022.
Linear interpolation gives -24.40 as a value for which P{d ≤ -24.40} = 0.025 were the
distribution continuous. To this value we must add 0.50 as a continuity correction,* and
finally have for the standardized deviate

$$\frac{-24.40 + 0.50 - (-7.50)}{6.68} = -2.46,$$

a figure given in row 10 of the table.

* In most problems of this kind the limiting distribution is continuous and no continuity correction
is required to find the standardized limiting percentage point. Here, both the distributions of d and D
are discontinuous, but the standardized argument interval of the former does not agree with the unit
interval of the latter distribution. The procedure suggested may be described as adding a correction
and later removing it on a modified scale to get the significance point for D.
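Step (a) can be reproduced numerically as follows. This is a sketch under stated assumptions, not the author's own computation: the tail probabilities are taken from the explicit expressions in Barton's addendum, the continuity correction of 0.5 is applied towards zero on the scale of |d|, and the function names are hypothetical. Because the paper interpolates between probabilities rounded to three decimals, small differences in the third significant figure are to be expected.

```python
from math import sqrt


def _tail(k, a, b):
    """Tail probability at distance k >= 1 from zero, on the side with ratio b (a = 1 - b)."""
    return b ** k * (a * k - 1 + 2 * b / (1 - a * b))


def standardized_limit(p, alpha=0.025, side="lower"):
    """Interpolated, continuity-corrected and standardized percentage point of d."""
    q = 1.0 - p
    big_q = p * q
    mean = 2 * (q - p) / big_q
    sd = sqrt(4 - 6 / big_q + 2 / big_q ** 2)
    a, b = (q, p) if side == "lower" else (p, q)
    k = 1
    while _tail(k, a, b) > alpha:          # first distance whose tail is <= alpha
        k += 1
    outer = _tail(k, a, b)
    if k > 1:
        inner = _tail(k - 1, a, b)
    else:                                  # bracketing point is d = 0: add f(0)
        inner = outer + 2 * big_q ** 2 / (1 - big_q)
    x = (k - 1) + (inner - alpha) / (inner - outer)   # linear interpolation in |d|
    x -= 0.5                                          # continuity correction
    value = -x if side == "lower" else x
    return (value - mean) / sd


print(round(standardized_limit(0.8, side="lower"), 2),
      round(standardized_limit(0.8, side="upper"), 2))
```

For p = 0.8 the two values agree with the entries -2.46 and 1.44 of rows 9 and 10 of Table 4.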


(b) We now apply to the standardized limits the values of the expectation and standard
deviation of D, for any chosen finite values of m and n, giving

Percentage limit for D = $\mathcal{E}(D) + \sigma(D)$ × standardized limit for d. (9)

Thus in the case m+n = 20, m = 16 or p = 0.8 we have, using Table 4 and the expressions
for $\mu_1'$ and $\mu_2$ from equations (4),

Upper 2.5% limit for D = 10.07 + 3.994 × 1.44 = 15.82.

[Fig. 3. The approximate limit 15.82 marked on the scale of D between 14 and 17.]

Table 5. Comparison of approximate and true 2.5% limits when m+n = 20

              Standardized limits    Case m+n = 20      Percentage points     Limits corrected     True limits
              from Table 4                              using standard-       for discontinuity    from Table 1
                                                        izing factors
  p     m     Upper    Lower         Mean D    σ_D      Upper     Lower       Upper    Lower       Upper    Lower
 0.5   10     1.99     -1.99         10.00     2.796    15.56      4.44        17        3          16        4
 0.6   12     1.77     -2.20         10.56     2.912    15.71      4.15        17        3          17        3
 0.7   14     1.57     -2.34         10.80     3.288    15.96      3.11        17        2          18        1
 0.8   16     1.44     -2.46         10.07     3.994    15.82      0.25        17        -          19        1
 0.9   18     1.29     -2.52          6.21     4.719    12.30     -5.68        13        -          18        -

Note. The entries '-' in the last column and last column but two indicate that no value of D is
significant at the lower 2.5% level.
If D were continuously distributed, 15.82 would be the approximation to the percentage
point; as it is not, and 15.82 > 15.50, we cannot include the value D = 16 in the 2.5% tail
region, but must be content with the statement P{D ≥ 17} ≤ 0.025. The position is illustrated
in Fig. 3.

10. Proceeding in this way estimates of upper and lower 2.5% points for the case
m+n = 20 were obtained as shown in Table 5. The true values, as given in Table 1, are shown
for comparison in the final columns. For p = 0.5, 0.6 and 0.7 the estimates are not more
than one unit in error,* but for p = 0.8 and 0.9 the method is clearly inadequate for m+n
as low as 20. Some improvement can be obtained for these extreme values by using the
form of limiting distribution suggested by Dr Barton at the end of his note (p. 387), but it
is doubtful whether the test should be used at all in samples of this size with n ≤ 4.

* In the case p = 0.5 agreement is really very close; had the entry in col. 7 been 15.50 and that in
col. 8 4.50, the figures in cols. 9 and 10 would have been 16 and 4.


11. It is suggested that for m+n > 20 and m/(m+n) not greater than, say, 0.75, approximate
limits may be obtained from equation (9), using values of the expectation and standard
deviation of $D_{n,m}$ found from the expressions for $\mu_1'$ and $\mu_2$ given in equation (4). Values of
the upper and lower 2.5% standardized limits for d can be obtained with sufficient accuracy
by interpolating in the following table.

Table 6. Corrected and standardized 2.5% limits for $d = D_{n,m} - m$

  p = m/(m+n)    0.50     0.55     0.60     0.65     0.70     0.75
  Upper          1.99     1.88     1.77     1.67     1.57     1.50
  Lower         -1.99    -2.10    -2.20    -2.28    -2.34     […]



As an example, we may find approximate 2.5% limits for the case m = 25, n = 15.
Here p = 0.625 and interpolation in Table 6 gives 1.72, -2.24 as the standardized limits.
From equation (4) we find

Mean $D_{15,25}$ = 23.03, $\mu_2$ = 11.4236, s.d. = 3.380.

This gives as limits for D:

23.03 + 1.72 × 3.380 = 28.84 and 23.03 - 2.24 × 3.380 = 15.46.

Correcting for discontinuity, we should take 30 and 14 as approximations to the upper
and lower 2.5% points. As 15.46 is only just below the borderline value of 15.50 we might
reasonably regard 15 rather than 14 as the lower limit.
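The whole approximation of sections 9 to 11 can be put together as a short routine. The sketch below is illustrative: the variance expression is a reconstruction checked against the paper's quoted values, only those entries of Table 6 that are clearly legible in the source are used, and the rounding rule for the discontinuity correction (upper limit: smallest integer U with U - 0.5 not below the continuous limit; lower limit: largest integer L with L + 0.5 not above it) is inferred from the worked examples. All names are hypothetical.

```python
from math import ceil, comb, floor, sqrt

# (p, upper, lower) standardized 2.5% limits; a legible subset of Table 6.
TABLE_6 = [(0.50, 1.99, -1.99), (0.60, 1.77, -2.20), (0.65, 1.67, -2.28), (0.70, 1.57, -2.34)]


def interpolate(p):
    """Linear interpolation of the standardized limits for 0.50 <= p <= 0.70."""
    for (p0, u0, l0), (p1, u1, l1) in zip(TABLE_6, TABLE_6[1:]):
        if p0 <= p <= p1:
            w = (p - p0) / (p1 - p0)
            return u0 + w * (u1 - u0), l0 + w * (l1 - l0)
    raise ValueError("p outside the tabulated range")


def approximate_limits(n, m):
    """Return (upper, lower, upper_integer, lower_integer) approximate 2.5% limits for D."""
    mean = m - 2 * m / (n + 1) + 2 * n / (m + 1)
    t = m + n + 1
    var = (2 * m * (n - 1) * t / ((n + 1) ** 2 * (n + 2))
           + 2 * n * (m - 1) * t / ((m + 1) ** 2 * (m + 2))
           + 8 * m * n / ((m + 1) * (n + 1)) - 4 + 4 / comb(m + n, n))
    std_upper, std_lower = interpolate(m / (m + n))
    upper = mean + sqrt(var) * std_upper          # equation (9), upper limit
    lower = mean + sqrt(var) * std_lower          # equation (9), lower limit
    return upper, lower, ceil(upper + 0.5), floor(lower - 0.5)


upper, lower, upper_int, lower_int = approximate_limits(n=15, m=25)
print(round(upper, 2), round(lower, 2), upper_int, lower_int)
```

A negative integer for the lower limit would mean that no value of D is significant at that level, as for the extreme cases of Table 5.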


12. As mentioned in the introduction the Rosenbaum test can be used for the test of
the hypothesis which we have set out. His test procedure consists of counting r, the number
of points of one sample which lie outside the extreme points of the other. It is interesting
to note that when one sample (say, the m-sample) is wholly included within the
extreme values of the other, the Rosenbaum r and D are connected by the relation

$$D_{n,m} = m + r.$$

A comparison of the percentage points of the D-tables and r-tables will show that the upper
significance points of m+r are never less than the corresponding significance points of $D_{n,m}$.
It might be expected that the D-criterion will be more sensitive to deviations from the
hypothesis tested than the Rosenbaum criterion, but this can only be decided definitely
when the power functions of the two tests have been obtained.

Finally I wish to thank Prof. E. S. Pearson and Dr F. N. David for taking a keen interest
in these investigations and for a number of helpful suggestions in the preparation of this
paper.


REFERENCES

MANN, H. B. & WHITNEY, D. R. (1947). Ann. Math. Statist. 18, 50.
PEARSON, E. S. & HARTLEY, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University Press.
ROSENBAUM, S. (1953). Ann. Math. Statist. 24, 663.





THE LIMITING DISTRIBUTION OF KAMAT’S TEST STATISTIC 
Addendum by D. E. BARTON, University College, London 


The limiting distribution of Kamat's D criterion is easily obtained from the following argument.
Kamat shows that the joint distribution of $i = R_n - n + 1$ and $j = R_m - m + 1$ is given by

$$f(i,j) = \begin{cases} 2K & (j = 0 = i), \\ K(m-i-1)\binom{n+i-2}{i} & (j = n,\ 0 \le i \le m-1), \\ K(n-j-1)\binom{m+j-2}{j} & (0 \le j \le n-1,\ i = m), \\ 2K\binom{i+j-2}{i-1} & (0 < j \le n-1,\ 0 < i \le m-1), \end{cases}$$

where $K^{-1} = \binom{m+n}{n}$.
The marginal distribution of i is consequently

$$f(i) = K(m+1-i)\binom{n+i-2}{i} \quad (0 \le i \le m),$$

for which the mean and variance are

$$\kappa_1 = m(n-1)/(n+1), \qquad \kappa_2 = 2m(n-1)(m+n+1)/\{(n+2)(n+1)^2\}.$$

That of j is the same with n and m interchanged.

If we let N = m+n tend to infinity so that $m/N \to p$, $n/N \to q$, we may write the moments as

$$\kappa_1 = Np - 2p/q + O(N^{-1}), \qquad \kappa_2 = 2p/q^2 + O(N^{-1}).$$

Thus if $r = m-i$, $s = n-j$, $d = s-r = D-m$, the first two moments of r and s tend to finite limits and
also, since the correlation coefficient of r and s tends to $-\sqrt{pq}$, so do the first two moments of d. The
probability distribution functions, equally, tend to a proper limit. Thus

$$f(r) = (r+1)\frac{m(m-1)\cdots(m-r+1)\,n(n-1)}{N(N-1)\cdots(N-r+1)(N-r)(N-r-1)} = (r+1)p^{r}q^{2} + O(N^{-1}),$$

and similarly the probability distribution function of r and s tends to

$$f(r,s) = \begin{cases} 2p^{r+1}q^{s+1} & (r>0,\ s>0), \\ (s-1)p^{2}q^{s} & (r=0,\ s>0), \\ (r-1)p^{r}q^{2} & (r>0,\ s=0). \end{cases}$$

The probability generating function

$$\Pi(x,y) = \sum f(r,s)\,x^{r}y^{s} = \left\{\frac{pq(x+y-xy)}{(1-px)(1-qy)}\right\}^{2}$$

follows at once from this, whence, putting $x = e^{-it}$, $y = e^{it}$,
we have the limiting characteristic function of d as

$$\phi(t) = \left\{\frac{pq(1 - e^{it} + e^{2it})}{(e^{it}-p)(1-qe^{it})}\right\}^{2}.$$

From this it may be verified that the first moment of d is $2(q-p)/(pq)$ and the higher moments are
the same as Kamat's limiting moments of D.

The actual limiting distribution of d is discrete and most simply got from the formula for f(r, s):

$$f(d) = \begin{cases} p^{-d}q^{2}\{2p^{2} - (1-Q)(1+d)\}/(1-Q) & (d<0), \\ 2Q^{2}/(1-Q) & (d=0), \\ q^{d}p^{2}\{2q^{2} - (1-Q)(1-d)\}/(1-Q) & (d>0), \end{cases}$$

where $Q = pq$. This reduces when $p = q = \tfrac12$ to

$$f(d) = \begin{cases} \tfrac16 & (d=0), \\ \tfrac13(3|d|-1)(\tfrac12)^{|d|+2} & (d\neq 0). \end{cases}$$

The tail probabilities may be obtained explicitly by summing the expressions for f(d). Thus

for d > 0: $P(d) = P\{s-r \ge d\} = q^{d}\{pd - 1 + 2q/(1-Q)\}$,
for d < 0: $P(d) = P\{s-r \le d\} = p^{|d|}\{q|d| - 1 + 2p/(1-Q)\}$.

These may be used to obtain limiting percentage points.
The modality of the probability distribution function is always triple, as it may be shown that (taking
p < q without loss of generality) the following relation holds:

$$\cdots < f(-3) < f(-2) > f(-1) < f(0) > f(1) < f(2) < \cdots < f(d_{\max}) > \cdots,$$

where $d_{\max}$ is the integral part of $1 + (1/p) - 2q^{2}/(1-pq)$.

The integer $d_{\max}$ equals 2 if $p \ge \tfrac13$, but rises rapidly for smaller p (taking the values 3, 4, 9 as p takes the
values 0.3, 0.2, 0.1).
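The statement about the three modes can be checked numerically. The sketch below is an illustration only: it locates the right-hand mode by direct search over the limiting probabilities and compares it with the closed expression for $d_{\max}$ as read from the (partly illegible) printed formula, namely the integral part of $1 + 1/p - 2q^2/(1-pq)$. The function names are hypothetical.

```python
from math import floor


def limiting_pmf(d, p):
    """Limiting probability of d = s - r for p = lim m/N (q = 1 - p)."""
    q = 1.0 - p
    big_q = p * q
    if d == 0:
        return 2 * big_q ** 2 / (1 - big_q)
    if d > 0:
        return q ** d * p ** 2 * (2 * q ** 2 - (1 - big_q) * (1 - d)) / (1 - big_q)
    return p ** (-d) * q ** 2 * (2 * p ** 2 - (1 - big_q) * (1 + d)) / (1 - big_q)


def d_max_formula(p):
    """Closed form for the right-hand mode (p < q assumed)."""
    q = 1.0 - p
    return floor(1 + 1 / p - 2 * q ** 2 / (1 - p * q))


def d_max_search(p, upto=200):
    """Right-hand mode found by scanning d = 1, ..., upto."""
    return max(range(1, upto + 1), key=lambda d: limiting_pmf(d, p))


for p in (0.4, 0.3, 0.2, 0.1):
    print(p, d_max_formula(p), d_max_search(p))
```

The remaining two modes sit at d = 0 and d = -2, which can be confirmed by comparing neighbouring ordinates.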
If n is kept finite but $m \to \infty$, a second limiting form may be obtained. In the notation used above

$$\mathcal{E}(s) = O(N^{-1}) = \operatorname{var}(s), \qquad \mathcal{E}(r) = O(N), \qquad \operatorname{var}(r) = O(N^{2}),$$

and hence $z = -d/N$ has the same finite limit as r/m. This is easily seen to be the Pearson Type I curve

$$f(z) = n(n-1)\,z(1-z)^{n-2}.$$





SEQUENTIAL ANALYSIS APPLIED TO CERTAIN EXPERIMENTAL 
DESIGNS IN THE ANALYSIS OF VARIANCE 


By W. D. RAY 


British Coal Utilization Research Association 


1. SUMMARY AND INTRODUCTION 


On the basis of some fundamental work due to Barnard (1952) and Cox (1952), Johnson
(1953) has derived a procedure for applying sequential tests to the general linear hypothesis.
Hoel (1955) obtains a similar test by a rather different method.

The general linear hypothesis underlies a number of common analysis of variance situations.
It is the purpose of this paper to provide tables which will make it possible to carry
out the suggested procedures in the cases of (a) one-way classification by groups and
(b) randomized blocks. In the course of construction of these tables, we have examined
a number of approximations to the confluent hypergeometric function and assessed their
accuracy.

Conjectural approximations to the expected sample sizes of the sequential processes
considered are also given.
2. A SEQUENTIAL TEST OF THE GENERAL LINEAR HYPOTHESIS

In the classical fixed sample case the general linear hypothesis may be expressed in matrix
form as follows:

(i) Let $x = (x_1, ..., x_N)$ be N independent normal variables.
(ii) $\mathcal{E}(x) = \theta C'$, where

$$\theta = (\theta_1, ..., \theta_s) = (\theta_1, ..., \theta_q \mid \theta_{q+1}, ..., \theta_s) = (\theta_{(1)} \;\vdots\; \theta_{(2)}),$$

$$C = \begin{pmatrix} c_{11} & \cdots & c_{1s} \\ \vdots & & \vdots \\ c_{N1} & \cdots & c_{Ns} \end{pmatrix} = (C_{(1)} \;\vdots\; C_{(2)}) \quad (s < N),$$

where C is partitioned similarly to $\theta$.
(iii) $V(x_i) = \sigma^2$ $(i = 1, ..., N)$.
(iv) The $\theta$'s and $\sigma$ are unknown parameters; C is known and is called the design matrix.

The hypothesis to be tested is $H_0$: $\theta_{(1)} = 0$. The likelihood ratio criterion is then a function
of $G = S_1/S_2$, where $S_2$ is the minimum of $(x - \theta C')(x - \theta C')'$ with respect to $\theta$ and $S_1 + S_2$
is the minimum of $(x - \theta_{(2)}C_{(2)}')(x - \theta_{(2)}C_{(2)}')'$ with respect to $\theta_{(2)}$.
If observations are taken sequentially then at each stage we add a further set of x's,
a new row of C being added for each additional observation. Thus after N observations have been
taken we may calculate a value $G^{(N)}$, say, of G.
Under certain conditions, which are satisfied in the applications considered in this paper,
a sequential test comparing the composite hypotheses

$$H_0: \theta_{(1)} = 0, \qquad H_1: \theta_{(1)} = \theta_{(1)}^{*},$$


may be based on the likelihood ratio (Arnold, 1951)

$$\frac{p(G^{(N)} \mid \lambda)}{p(G^{(N)} \mid 0)} = e^{-\frac12\lambda}\, M\!\left(\frac{N-s+q}{2}, \frac{q}{2}; \frac{\tfrac12\lambda G^{(N)}}{1+G^{(N)}}\right), \tag{1}$$

where

$$\lambda = \theta_{(1)}^{*}\{C_{(1)}'C_{(1)} - C_{(1)}'C_{(2)}(C_{(2)}'C_{(2)})^{-1}C_{(2)}'C_{(1)}\}\theta_{(1)}^{*\prime}/\sigma^{2}$$

and

$$M(X, Y; u) = \sum_{j=0}^{\infty} \frac{\Gamma(X+j)\,\Gamma(Y)}{\Gamma(X)\,\Gamma(Y+j)}\,\frac{u^{j}}{j!}. \tag{2}$$

The procedure is exactly analogous therefore to that employed in comparing two simple
hypotheses:

'Accept $H_1$ if $p(G^{(N)} \mid \lambda)/p(G^{(N)} \mid 0) \ge (1-\beta)/\alpha$.
Accept $H_0$ if $p(G^{(N)} \mid \lambda)/p(G^{(N)} \mid 0) \le \beta/(1-\alpha)$.
Otherwise take further observations.'

$\alpha$, $\beta$ are the (approximate) chances of erroneous rejection of $H_0$, $H_1$, respectively.
Evidently for all values of $\theta_{(1)}^{*}$ giving the same value for the scalar quantity $\lambda$ there will
be the same sequential procedure.
In the following sections we shall consider the application of this general method to the
special cases mentioned in §1. In these cases we always have $\lambda = N\delta$, where $\delta$ does not
depend on N, and so our alternative hypothesis $H_1$ can be defined in terms of $\delta$.


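3. ONE-WAY CLASSIFICATION BY GROUPS

This is the simplest design in the analysis of variance. The data are arranged in k groups,
and we are interested in testing hypotheses about possible differences between the means of the
different groups. The sequential procedure which we shall consider consists of taking an
equal number of observations from each group at each stage in the experiment. The
mathematical model may be expressed in the following form:

$$x_{ti} = a + b_t + z_{ti} \quad (t = 1, ..., k;\ i = 1, ..., n),$$

where

$$\mathcal{E}(z_{ti}) = 0, \qquad V(z_{ti}) = \sigma^{2}, \qquad \sum_{t=1}^{k} b_t = 0.$$

In this case N = kn, s = k and q = k-1,

$$G^{(n)} = \frac{n\sum_{t=1}^{k}(\bar{x}_{t.} - \bar{x}_{..})^{2}}{\sum_{t=1}^{k}\sum_{i=1}^{n}(x_{ti} - \bar{x}_{t.})^{2}}, \qquad \lambda^{(n)} = \frac{n\sum_{t} b_t^{2}}{\sigma^{2}} \qquad\text{and}\qquad \delta = \frac{\sum_{t} b_t^{2}}{k\sigma^{2}}.$$

The procedure is now:

'Accept $H_1$ if $e^{-\frac12 kn\delta} M\!\left(\frac{kn-1}{2}, \frac{k-1}{2}; \frac{\tfrac12 kn\delta\, G^{(n)}}{1+G^{(n)}}\right) \ge \frac{1-\beta}{\alpha}$.
Accept $H_0$ if $e^{-\frac12 kn\delta} M\!\left(\frac{kn-1}{2}, \frac{k-1}{2}; \frac{\tfrac12 kn\delta\, G^{(n)}}{1+G^{(n)}}\right) \le \frac{\beta}{1-\alpha}$.
Otherwise take a further set of k (or mk) observations, one (or m) from each group.'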


This is equivalent to the conditions: (i) accept $H_1$ if $G^{(n)} \ge \bar{G}^{(n)}$; (ii) accept $H_0$ if $G^{(n)} \le \underline{G}^{(n)}$;
(iii) if $\underline{G}^{(n)} < G^{(n)} < \bar{G}^{(n)}$ continue sampling, where $\underline{G}^{(n)}$, $\bar{G}^{(n)}$ are the solutions of the equations
in $G^{(n)}$:

$$\frac{\beta}{1-\alpha},\ \frac{1-\beta}{\alpha} = e^{-\frac12 kn\delta} M\!\left(\frac{kn-1}{2}, \frac{k-1}{2}; \frac{\tfrac12 kn\delta\, G^{(n)}}{1+G^{(n)}}\right).$$

The appropriate limits $\underline{G}^{(n)}$, $\bar{G}^{(n)}$ for $\delta$ = 0.5, 1.0 and 2.0 with probabilities of error
$\alpha = \beta = 0.05$ and various values of k are given in Tables 1, 2 and 3 printed on pp. 399-4…
below.


4. RANDOMIZED BLOCKS 


The well-known model for randomized blocks with k 'blocks' and n 'varieties' is

$$x_{ti} = a + b_t + v_i + z_{ti} \quad (t = 1, ..., k;\ i = 1, ..., n),$$

where

$$\sum_{t=1}^{k} b_t = 0, \qquad \sum_{i=1}^{n} v_i = 0$$

and

$$\mathcal{E}(z_{ti}) = 0, \qquad V(z_{ti}) = \sigma^{2}.$$

The sequential procedure with this design, having fixed the number of varieties n, is to
decide at each stage whether or not to add another block of the same n varieties.
Hence we have N = kn, s = k+n-1 and if $H_0$ be the hypothesis $v_i = 0$ $(i = 1, ..., n)$
then q = n-1,

$$G^{(k)} = \frac{k\sum_{i=1}^{n}(\bar{x}_{.i} - \bar{x}_{..})^{2}}{\sum_{t=1}^{k}\sum_{i=1}^{n}(x_{ti} - \bar{x}_{t.} - \bar{x}_{.i} + \bar{x}_{..})^{2}}, \qquad \lambda^{(k)} = \frac{k\sum_{i} v_i^{2}}{\sigma^{2}} \qquad\text{and}\qquad \delta = \frac{\sum_{i} v_i^{2}}{n\sigma^{2}}.$$

Analogously to the case of the one-way classification by groups, the sequential procedure
may now be defined by: (i) accept $H_1$ if $G^{(k)} \ge \bar{G}^{(k)}$, (ii) accept $H_0$ if $G^{(k)} \le \underline{G}^{(k)}$, (iii) otherwise
add another block to the experiment. In this case $\underline{G}^{(k)}$, $\bar{G}^{(k)}$ are the solutions of the following
equations in $G^{(k)}$:

$$\frac{\beta}{1-\alpha},\ \frac{1-\beta}{\alpha} = e^{-\frac12 kn\delta} M\!\left(\frac{k(n-1)}{2}, \frac{n-1}{2}; \frac{\tfrac12 kn\delta\, G^{(k)}}{1+G^{(k)}}\right).$$

The appropriate limits $\underline{G}^{(k)}$, $\bar{G}^{(k)}$ for $\delta$ = 0.5, 1.0 and 2.0 with $\alpha = \beta = 0.05$ and various
values of n are given in Tables 4, 5 and 6 printed on pp. 402-403 below.


5. DETERMINATION OF THE LIMITS $\underline{G}$, $\bar{G}$

The determination of these limits entails the use of fairly extensive tables of the confluent hypergeometric function $M(X, Y; u)$. In this connexion Rushton (1954) and Rushton & Lang (1954) have provided a fairly comprehensive survey and tabulation of the function. For the values covered by Rushton & Lang's tables, $\underline{G}$ and $\bar{G}$ were calculated by inverse interpolation of the function $\log M(X, Y; u)$ with regard to $u$.

Unfortunately, the tables were not sufficiently extensive to cover all the cases it was necessary to consider. For example, in the one-way classification by groups for even (odd) $k$ only limits at stages of even (odd) $n$ could be determined, while in the case of randomized


W. D. Ray 391 


blocks only at odd (even) $k$ for even (odd) $n$. So, too, it was found for some $k$ and $n$ that not close enough intervals of $u$ had been tabulated to obtain rapid enough convergence of the inverse interpolation process.

In order to cover these situations, attempts were made to obtain useful approximations 
to the likelihood ratio or equivalently to the confluent hypergeometric function. 


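6. APPROXIMATIONS TO THE LIKELIHOOD RATIO

(a) The likelihood ratio $R$ is the ratio of a non-central $F$ (or $G$) distribution to a central $F$ (or $G$) distribution where the degrees of freedom of the $F$'s are $\nu_1 = q$, $\nu_2 = N-s$ and the non-centrality parameter is $\lambda$.

Thus
$$R = \frac{p(F' \mid H_1)_{F'=F}}{p(F \mid H_0)} = \frac{p(G' \mid H_1)_{G'=G}}{p(G \mid H_0)},$$

where $F = \nu_2 G/\nu_1$ and $F' = \nu_2 G'/\nu_1$.

Now
$$p(G' \mid H_1) = e^{-\frac12\lambda}\,\frac{\Gamma(\tfrac12(q+N-s))}{\Gamma(\tfrac12 q)\,\Gamma(\tfrac12(N-s))}\,\frac{G'^{\frac12 q-1}}{(1+G')^{\frac12(N-s+q)}}\, M\!\left(\tfrac12(N-s+q),\ \tfrac12 q;\ \frac{\lambda G'}{2(1+G')}\right),$$

$$p(G \mid H_0) = \frac{\Gamma(\tfrac12(q+N-s))}{\Gamma(\tfrac12 q)\,\Gamma(\tfrac12(N-s))}\,\frac{G^{\frac12 q-1}}{(1+G)^{\frac12(N-s+q)}}.$$

Thus
$$R = e^{-\frac12\lambda}\, M\!\left(\tfrac12(N-s+q),\ \tfrac12 q;\ \frac{\lambda G}{2(1+G)}\right). \qquad (1\ \text{bis})$$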

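(b) Patnaik (1949) has approximated to the non-central $F$-distribution by a central $F$-distribution having the same first two moments. In the present instance the $F'$ distribution in the numerator of $R$ would be approximated by a $\{(q+\lambda)/q\}\,F_{k,\nu_2}$ distribution, where $k = (q+\lambda)^2/(q+2\lambda)$. Hence

$$p(G' \mid H_1)_{G'=G} \simeq \frac{\Gamma(\tfrac12(k+\nu_2))}{\Gamma(\tfrac12 k)\,\Gamma(\tfrac12\nu_2)}\left(\frac{k}{\nu_1+\lambda}\right)^{\frac12 k}\frac{G^{\frac12 k-1}}{\{1+kG/(\nu_1+\lambda)\}^{\frac12(k+\nu_2)}}, \qquad (2)$$

while
$$p(G \mid H_0) = \frac{\Gamma(\tfrac12(\nu_1+\nu_2))}{\Gamma(\tfrac12\nu_1)\,\Gamma(\tfrac12\nu_2)}\,\frac{G^{\frac12\nu_1-1}}{(1+G)^{\frac12(\nu_1+\nu_2)}},$$

and thus
$$R \simeq \frac{\Gamma(\tfrac12(k+\nu_2))\,\Gamma(\tfrac12\nu_1)}{\Gamma(\tfrac12 k)\,\Gamma(\tfrac12(\nu_1+\nu_2))}\left(\frac{k}{\nu_1+\lambda}\right)^{\frac12 k}\frac{G^{\frac12(k-\nu_1)}\,(1+G)^{\frac12(\nu_1+\nu_2)}}{\{1+kG/(\nu_1+\lambda)\}^{\frac12(k+\nu_2)}}, \qquad (3)$$

where $k = (\nu_1+\lambda)^2/(\nu_1+2\lambda)$.

By virtue of (1 bis) this can also be regarded as an approximation to the confluent hypergeometric function.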


(c) While the approximation (3) holds for all $\nu_1$, $\nu_2$ it was considered worth while to investigate whether any simplification resulted as $\nu_2 \to \infty$. Patnaik (1949) has shown that the first two moments of $F'$ are

$$\mu_1'(F') = \frac{\nu_2(\nu_1+\lambda)}{\nu_1(\nu_2-2)}, \qquad \mu_2(F') = \frac{2\nu_2^2}{\nu_1^2(\nu_2-2)(\nu_2-4)}\left\{\frac{(\nu_1+\lambda)^2}{\nu_2-2}+\nu_1+2\lambda\right\}.$$

Hence for $\nu_2$ large (but not infinite)

$$\mu_2(F') \simeq \frac{2(\nu_1+2\lambda)}{\nu_1^2}\left(\frac{\nu_2}{\nu_2-2}\right)^2,$$

and therefore $\nu_1(1-2/\nu_2)F'$ has moments $\mu_1' = \nu_1+\lambda$, $\mu_2 = 2(\nu_1+2\lambda)$. We therefore approximate to the distribution of $F'$ by regarding $\nu_1(1-2/\nu_2)F'$ as being distributed approximately as $\chi'^2$ with $\nu_1$ degrees of freedom and non-centrality parameter $\lambda$.
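Inserting the corresponding approximations for the distribution of $G$, we obtain

$$R \simeq e^{-\frac12\lambda}\,\Gamma(\tfrac12\nu_1)\,\left\{\tfrac12(\lambda(\nu_2-2)G)^{\frac12}\right\}^{-(\frac12\nu_1-1)}\, I_{\frac12\nu_1-1}\!\left((\lambda(\nu_2-2)G)^{\frac12}\right), \qquad (4)$$

where $I_{\frac12\nu_1-1}$ is a Bessel function of imaginary argument.

When $\nu_1 = 1$, that is, for example, when $k = 2$ in the one-way classification by groups, we have

$$R \simeq e^{-\frac12\lambda}\,\Gamma(\tfrac12)\,\left\{\tfrac12(\lambda(\nu_2-2)G)^{\frac12}\right\}^{\frac12} I_{-\frac12}\!\left((\lambda(\nu_2-2)G)^{\frac12}\right) = e^{-\frac12\lambda}\cosh\!\left((\lambda(\nu_2-2)G)^{\frac12}\right). \qquad (5)$$

This may be compared with an approximation to the confluent hypergeometric function given by Arnold (1951).

Using the relationship

$$I_X(\zeta) = \frac{(\tfrac12\zeta)^X e^{\zeta}}{\Gamma(X+1)}\, M(X+\tfrac12,\ 2X+1;\ -2\zeta),$$

and Kummer's formula $M(X, Y; z) = e^{z} M(Y-X, Y; -z)$, we obtain (4) in the form

$$R \simeq e^{-\frac12\lambda}\, e^{-(\lambda(\nu_2-2)G)^{\frac12}}\, M\!\left(\tfrac12\nu_1-\tfrac12,\ \nu_1-1;\ 2(\lambda(\nu_2-2)G)^{\frac12}\right).$$

Comparing (1 bis) and (4) it may be noted that we have in fact approximated to a confluent hypergeometric function by a similar function of a slightly different form.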
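Because approximation (5) involves only a hyperbolic cosine, it can be inverted in closed form. The sketch below is illustrative only, with a hypothetical function name, and assumes $\nu_1 = 1$: it sets $e^{-\frac12\lambda}\cosh z$ equal to each of the two boundary values and recovers $G = z^2/\{\lambda(\nu_2-2)\}$.

```python
import math


def cosh_limits(lam, nu2, alpha=0.05, beta=0.05):
    """Limits from R ~ exp(-lam/2) * cosh(sqrt(lam * (nu2 - 2) * G)), for nu1 = 1."""

    def solve(target):
        c = target * math.exp(0.5 * lam)  # required value of cosh(z)
        if c < 1.0:
            return None  # cosh(z) >= 1 for real z, so no real solution exists
        z = math.acosh(c)
        return z * z / (lam * (nu2 - 2))

    lower = solve(beta / (1.0 - alpha))
    upper = solve((1.0 - beta) / alpha)
    return lower, upper
```

With k = 2, n = 10 and δ = 0.5 (so that λ = 10 and ν₂ = N − s = 18) the sketch gives limits close to the entries 0.047 and 0.466 shown for approximation (5) in Table 8.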


"8 


7. CONSTRUCTION OF THE TABLES OF LIMITS $\underline{G}$, $\bar{G}$

In order to decide on the extent of tabulation required, the results of Baker (1950) were taken as a rough guide. In an empirical investigation into the distribution of sample size in sequential tests comparing certain simple hypotheses he found that in only a small percentage of cases did the actual observed sample size exceed three times the expected sample size. The tables were accordingly constructed to cover sample sizes up to three times the expected sample size, and it is to be anticipated that they will cover most of the cases encountered in practice.

Wherever possible the logarithm of the expression (1 bis) was equated in turn to $\log\{(1-\beta)/\alpha\}$, $\log\{\beta/(1-\alpha)\}$, and the solutions $\bar{G}$, $\underline{G}$ respectively obtained by inverse interpolation. By far the greater content of Tables 1-6 was arrived at by this method. To provide certain intermediate values (when, in the case of the one-way classification, $k$ and $n$ differed in parity), or when the range of the tables of the confluent hypergeometric function was exceeded, recourse was made to approximation (3). Solutions $\bar{G}'$, $\underline{G}'$ obtained by the Newton-Raphson iterative method applied to approximation (3) were of great assistance in approximating to the exact values. It was found that $\bar{G}'$ was a very close estimator of $\bar{G}$ but that $\underline{G}'$ was not equivalently as good an estimator of $\underline{G}$, at least for small $n$ (or $k$ in the case of randomized blocks). However, it was found that the rate of change of the limits for increasing $n$ (or $k$) was almost identical using the correct form (1 bis) or the approximate one (3). Thus knowing the approximate limits and their differences for increasing $n$ (or $k$) could be utilized to obtain good estimates of the true limits providing a few exact values had already been obtained for small $n$ (or $k$). Estimates obtained this way are printed in italic type in Tables 1-6.

Table 7. Comparison of limits obtained from equations (1 bis) and (3)

One-way classification by groups. $\delta = 1$, $\alpha = \beta = 0.05$.


k = 2

  n    λ    $\underline{G}$    $\bar{G}$    $\underline{G}'$    $\bar{G}'$
  4    8    0.052    2.390    0.096    2.410
  6   12    0.105    1.016    0.138    1.001
  8   16    0.135    0.710    0.164    0.700
 10   20    0.150    0.578    0.179    0.580
 12   24    0.166    0.504    0.195    0.502

k = 5

  n    λ    $\underline{G}$    $\bar{G}$    $\underline{G}'$    $\bar{G}'$
  3   15    0.331    2.469    0.348    2.497
  5   25    0.340    0.973    0.357    0.958
  7   35    0.331    0.687    0.347    0.685
  9   45    0.322    0.565    0.338    0.568
 11   55    0.314    0.498    0.330    0.503

$\underline{G}$, $\bar{G}$ were obtained from (1 bis); $\underline{G}'$, $\bar{G}'$ from (3).

Table 8. Comparison of limits obtained from equations (1 bis) and (5)

One-way classification by groups. $\delta = 0.5$, $\alpha = \beta = 0.05$.

k = 2

  n    λ    $\underline{G}$    $\bar{G}$    $\underline{G}''$    $\bar{G}''$
  6    6    0.002    1.091    0.012    0.918
  8    8    0.025    0.639    0.032    0.608
 10   10    0.040    0.471    0.047    0.466
 12   12    0.050    0.385    0.058    0.387
 14   14    0.059    0.333    0.067    0.337
 16   16    0.066    0.300    0.074    0.302
 18   18    0.072    0.276    0.079    0.277
 20   20    0.076    0.258    0.083    0.258
 30   30    0.089    0.205    0.097    0.207

$\underline{G}$, $\bar{G}$ were obtained from (1 bis); $\underline{G}''$, $\bar{G}''$ from (5).


The approximation (4) has not been investigated numerically, except for the case when $\nu_1 = 1$. It is found that it gives reasonable agreement of the limits when $n$ is fairly large. Once again, however, the rate of change of limits with increasing $n$ (or $k$) is similar whether (1 bis) or (5) is used to obtain them.

Thus either approximation may be used together with some known exact limits obtained from (1 bis) to expand or interpolate where required in Tables 1-6. (Tables 7 and 8 give a few examples of the accuracy of these approximations.)

It may be noted from Tables 1-6 that in certain cases no decision is possible before a certain stage. This is because $\underline{G}$ by definition must be real, finite and positive. Thus in some situations an initial sample must be taken before the sequential procedure can be put into operation.

8. THE PRACTICAL USE OF THE TABLES 


Example 1. Suppose we require to take observations from each of three groups and postulate that the model of §3 might describe the situation. We wish to test the null hypothesis that each group mean is zero against the alternative hypothesis that


3 
x 


ô = She = 05. 


The probabilities of error $\alpha$, $\beta$ are taken to be equal to 0.05.


Table 9

                                                                                              Sum of squares
  n    $x_{1j}$   $\Sigma x_{1j}$   $\Sigma x_{1j}^2$    $x_{2j}$   $\Sigma x_{2j}$   $\Sigma x_{2j}^2$    $x_{3j}$   $\Sigma x_{3j}$   $\Sigma x_{3j}^2$    Between    Within     $G$
  1     0.6     0.6     0.36      2.3     2.3     5.29     -7.8    -7.8    60.84       -         -        -
  2    -4.4    -3.8    19.72     13.4    15.7   184.85      9.3     1.5   147.33       -         -        -
  3    15.1    11.3   247.73      3.8    19.5   199.29     13.4    14.9   326.89       -         -        -
  4     2.8    14.1   255.57      5.7    25.2   231.78     -6.1     8.8   364.10       -         -        -
  5    -6.1     8.0   292.78      9.9    35.1   329.79      8.1    16.9   429.71     76.32    735.96    0.104
  6     3.0    11.0   301.78      5.7    40.8   362.28     10.1    27.0   531.72     74.14    776.67    0.095
  7     0.6    11.6   302.14     -1.1    39.7   363.49      2.9    29.9   540.13     58.12    833.67    0.070

Three samples from normal populations whose means are zero and whose standard deviations are 10 have been chosen to illustrate the procedure.

In Table 9 the appropriate calculations are shown, a value of $G$ being calculated at each stage of sampling after $n = 5$. Table 1 is consulted until one or other of the corresponding limits is violated. The example considered results in a correct decision in favour of the null hypothesis when $n = 7$.
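The arithmetic of Example 1 can be retraced as follows. This is a sketch under the assumption that $G$ is the ratio of the between-groups to the within-groups sum of squares, which reproduces the sums of squares shown in Table 9; the observations are those listed in the table and the function name is hypothetical.

```python
# Observations x_1j, x_2j, x_3j of Table 9, in order of sampling.
GROUPS = [
    [0.6, -4.4, 15.1, 2.8, -6.1, 3.0, 0.6],
    [2.3, 13.4, 3.8, 5.7, 9.9, 5.7, -1.1],
    [-7.8, 9.3, 13.4, -6.1, 8.1, 10.1, 2.9],
]


def stage_statistic(groups, n):
    """Between and within sums of squares and their ratio G after n observations per group."""
    k = len(groups)
    totals = [sum(g[:n]) for g in groups]
    raw = sum(x * x for g in groups for x in g[:n])   # total of squared observations
    group_term = sum(t * t for t in totals) / n       # sum of (group total)^2 / n
    correction = sum(totals) ** 2 / (n * k)           # (grand total)^2 / (nk)
    between = group_term - correction
    within = raw - group_term
    return between, within, between / within
```

At n = 7 the ratio falls just below the lower limit 0.072 given in Table 1 for k = 3, n = 7, which is the stage at which the null hypothesis is accepted in the example.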


Example 2. A practical application in which the randomized block design of §4 can be used, occurs when sampling for dust concentration in a rectangular duct (see Fig. 1). Sampling probes can be transversely inserted in four positions $P_1$, $P_2$, $P_3$, $P_4$ and collections of dust made over the width of the duct. It is required to determine whether or not the effect of gravity on the dust flow is creating stratification from top to bottom. Sampling is carried out at each of the four positions in random order repeating the scheme after a series of fixed time intervals. The accuracy of measurement of dust concentration is known beforehand.

In this situation the 'varieties' correspond to the four positions and the 'blocks' to the elements of the time sequence. The hypothesis of stratification is numerically interpreted as a particular value of $\delta$ and the sampling procedure carried out until a decision is reached on its validity.

Fig. 1. Cross-section of rectangular duct.


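9. EXPECTED SAMPLE NUMBER

Although no general formula for the expected sample number for composite hypotheses exists, Bhate (1954) conjectures that the natural generalization of Wald's formula:

$$\mathcal{E}[\log R \mid N = \mathcal{E}(N)] = \begin{cases} (1-\alpha)\log\dfrac{\beta}{1-\alpha} + \alpha\log\dfrac{1-\beta}{\alpha} & \text{if } H_0 \text{ is true},\\[2ex] \beta\log\dfrac{\beta}{1-\alpha} + (1-\beta)\log\dfrac{1-\beta}{\alpha} & \text{if } H_1 \text{ is true}, \end{cases} \qquad (6)$$

where $R$ is the likelihood ratio, may give useful results.

As a first approximation, using the method described below, the left-hand side of the equation was evaluated as the value of $\log R$ at the expected value of $u$. The results, which are both conjectural and approximate, are given in Table 10 together with the sample sizes of the fixed sample size test having the prescribed $\alpha$ and $\beta$. It will be seen that, as in other cases of sequential tests, the saving in sample size is of the order of one-third.

Substituting the appropriate expression for $R$ from (1 bis), the left-hand side of equation (6) is

$$\mathcal{E}[-\tfrac12\lambda\log_{10}e + \log_{10}M(X, Y; u)],$$

where $X = \tfrac12(N-s+q)$, $Y = \tfrac12 q$, $u = \tfrac12\lambda G/(1+G)$.

Now by Taylor's theorem

$$\log_{10}M(X, Y; u) = \log_{10}M(X, Y; \mathcal{E}(u)) + (u-\mathcal{E}(u))\left[\frac{\partial}{\partial u}\log_{10}M\right]_{u=\mathcal{E}(u)} + \frac{(u-\mathcal{E}(u))^2}{2!}\left[\frac{\partial^2}{\partial u^2}\log_{10}M\right]_{u=\mathcal{E}(u)} + \ldots$$

Hence we have to a first approximation

$$\mathcal{E}[\log_{10}R] \simeq -\tfrac12\lambda\log_{10}e + \log_{10}M(X, Y; \mathcal{E}(u)).$$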


When $H_0$ is true it may be shown that

$$\mathcal{E}(u) = \frac{\lambda q}{2(N-s+q)},$$

and when $H_1$ is true

$$\mathcal{E}(u) = \tfrac12\lambda\, e^{-\frac12\lambda}\sum_{m=0}^{\infty}\frac{(\tfrac12\lambda)^m}{m!}\,\frac{Y+m}{X+m}.$$


Table 10. Conjectural expected sample sizes (n or k) for the sequential test when $H_0$ is true and corresponding values for the fixed sample size test, when $\alpha = \beta = 0.05$


(a) One-way classification by groups 


| 
esis gel 8-2 E 
n n á 
k Sequential Fixed Sequential Fixed Sequential Fixed 
uu 
2 10 14 5 8 3 Y^ 
3 8 11 4 6 2 
" 6-7 9 3 5 2 z 
5 6 8 3 4 2 3 
6 4 1 3 4 2 3 
7 4 6 3 4 $ A 
8 2-3 4 s : 
9 2 3 ee 
(b) Randomized blocks E 
8—0-5 8-1 ae 
k k ye 
| 
qd 1 i i ed 
n Sequential Fixed Sequential Fixed Sequential Be 
6 
2 11 15 5-6 9 3 4-5 
3 8 12 4 7 2-3 4 
4 7 10 4 6 2 3 
5 6 8-9 3 5 2 3 
6 5 8 3 4-5 2 
7 5 7 3 4 : 
8 4 6 2-3 4 : | eet 
B 


Thus on the assumption that $H_0$ is true we have, for $\alpha = \beta = 0.05$:

(i) For the one-way classification by groups, to solve for $n$ the equation

$$-\tfrac12 nk\delta\log_{10}e + \log_{10}M\!\left(\tfrac12(kn-1),\ \tfrac12(k-1);\ \tfrac12(k-1)\delta\right) = -1.15,$$

where in reaching this result $nk/(nk-1)$ has been taken as unity.

(ii) For the case of randomized blocks, to solve for $k$ the equation

$$-\tfrac12 nk\delta\log_{10}e + \log_{10}M\!\left(\tfrac12 k(n-1),\ \tfrac12(n-1);\ \tfrac12 n\delta\right) = -1.15.$$

Such solutions are those tabulated in Table 10 (a) and (b). Examination of further terms in the Taylor expansion reveals that no wide discrepancy results from only considering its first term.

Several sampling experiments have been carried out and give average sample sizes in very close agreement with those obtained by the above method.
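Equation (i) is a single equation in one unknown and can be solved numerically. The following sketch is an illustration with hypothetical function names, not the computation used for Table 10: it treats $n$ as continuous, evaluates $M$ from its power series, and brackets the root by bisection, assuming the left-hand side crosses the target value only once in the bracket.

```python
import math


def hyp1f1(a, b, x, tol=1e-15, max_terms=5000):
    """Confluent hypergeometric function M(a, b; x) from its power series (x >= 0)."""
    term = 1.0
    total = 1.0
    for m in range(max_terms):
        term *= (a + m) / (b + m) * x / (m + 1)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def expected_stage_one_way(k, delta, alpha=0.05, beta=0.05):
    """Solve equation (i) for n (H0 true), taking nk/(nk - 1) as unity."""
    target = ((1 - alpha) * math.log10(beta / (1 - alpha))
              + alpha * math.log10((1 - beta) / alpha))  # about -1.15 for 0.05, 0.05

    def lhs(n):
        m_value = hyp1f1(0.5 * (n * k - 1), 0.5 * (k - 1), 0.5 * (k - 1) * delta)
        return -0.5 * n * k * delta * math.log10(math.e) + math.log10(m_value)

    lo, hi = 1.0, 500.0          # lhs(lo) > target > lhs(hi) is assumed
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For k = 2 and δ = 1 the root lies a little above 5, in line with the entry 5 in Table 10 (a).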

This conjectural formula of Bhate's can also be used to give the expected sample size in a two-sided sequential t-test. This test is equivalent to a one-way classification by groups when $k = 2$. Arnold (1951) reports the sample sizes at which a decision was reached for each of 500 such tests both when the null hypothesis was true and when a particular alternative was true.
Was true. The ô of Arnold’s paper is, in our terminology, \(36) and his v? corresponds to our 


The probabilities of error were taken to be $\alpha = \beta = 0.05$ and $\delta = 1$. The average sample number when the null hypothesis was true was reported as 10.03, while the value given by the conjectural formula was $N = 2n = 10.0$.

On the assumption that $H_1$ is true we have to determine

$$\mathcal{E}(u) = \tfrac12\lambda\, e^{-\frac12\lambda}\sum_{m=0}^{\infty}\frac{(\tfrac12\lambda)^m}{m!}\,\frac{Y+m}{X+m}$$

for different values of $X$, $Y$.


more amenable form of this expression may be obtained thus: 
Let Op $ OUO m. cay -X)T«1 
Sachs Reem) os 
Where 1 1A (3A)? 


T-x*iuxa) 2-23 
à 
p? qurT- Í " NAGA) = Vas SY» 
0 


an 
d a reduction formula for Vx is 
py = qug e - OC 71) Vae 


Alternatively x= el^ [Gay -rga 4 r(r— 1) Gay? es (—1yrl]. 


Hence P a Sm 


“aye ee 


Calculations to determine the expected sample size when the alternative hypothesis $H_1$ is true gave results almost identical to those obtained when $H_0$ is true.


10. COMMENTS

As far as is known there is no proof that tests of this type terminate with probability one. Recently, however, David and Kruskal (1956) have provided such a proof for the sequential t-test. Their proof uses some asymptotic forms for large $n$ of the numerator and denominator of the likelihood ratio. It may be therefore that the asymptotic forms of §6 may be helpful in proving an analogous result for the tests described above.

It is perhaps worth while noting that other designs in the analysis of variance may find the following tables applicable. For example, in a factorial experiment it may be required to know whether or not first to take two levels of a factor, then three or four until some hypothesis is accepted or rejected. So too in a hierarchal classification by groups, sets of successive subgroups within main groups could be added until some postulated hypothesis was accepted or denied.

The problem of choosing the sequential test appropriate to fit a given situation is not altogether easy. In the present case if $k$ (or $n$) is fixed by the nature of the problem, we have still a choice of values for the parameters $\alpha$, $\beta$ and $\delta$, and in making this choice may well be influenced by the expected sample size.* The tables given in the present paper are, of course, far from complete as far as $\alpha$, $\beta$ and $\delta$ are concerned, but it is hoped that they give enough information to make some application of theory to practice possible.


I would like to thank Dr N. L. Johnson most gratefully for his advice and encouragement in the preparation of this paper and the Director General of the British Coal Utilization Research Association for permission to publish it.


REFERENCES

ARNOLD, K. J. (1951). Introduction to Tables to facilitate sequential t-tests. National Bureau of Standards, A.M.S. 7.
BAKER, A. G. (1950). Properties of some tests in sequential analysis. Biometrika, 37, 334.
BARNARD, G. A. (1952). The frequency justification of certain sequential tests. Biometrika, 39, 144.
BHATE, D. H. (1954). Ph.D. Thesis, University of London.
COX, D. R. (1952). Sequential tests for composite hypotheses. Proc. Camb. Phil. Soc. 48, 290.
DAVID, H. T. & KRUSKAL, W. H. (1956). The WAGR sequential t-test reaches a decision with probability one. Ann. Math. Statist. 27, 797.
HOEL, P. G. (1955). On a sequential test for the general linear hypothesis. Ann. Math. Statist. 26, 136.
JOHNSON, N. L. (1953). Some notes on the application of sequential methods in the analysis of variance. Ann. Math. Statist. 24, 614.
PATNAIK, P. B. (1949). The non-central $\chi^2$ and F-distributions and their applications. Biometrika, 36, 202.
PEARSON, E. S. & HARTLEY, H. O. (1951). Charts of the power function for analysis of variance tests, derived from the non-central F-distribution. Biometrika, 38, 112.
RUSHTON, S. (1954). On the confluent hypergeometric function M(α, γ; x). Sankhyā, 13, 369.
RUSHTON, S. & LANG, E. D. (1954). Tables of the confluent hypergeometric function. Sankhyā, 13, 377.

* Considerations that arise in the interpretation of $\delta$ were discussed in another connexion by Pearson & Hartley (1951, pp. 125-9) when considering the use of charts for the power function of analysis of variance tests.


Table 1. One-way classification by groups (true limits)

k = no. of groups; n = no. within a group; λ = nkδ; δ = 0.5; α = β = 0.05.


prm 
k=2 Ł=3 k=4 
* à G G n a a G G 
LE | | 
t 4 = 5-390 5 T5 | 0-037 4 8 0-005 | 1-825 
8 6 0-002 | 1-091 7 | 105 072 8 | 12 410 | 0-770 
10 8 025 | 0-639 9 | 135 8 | 16 126 -521 
19 10 -040 -471 11 16:5 0 20 1131 411 
12 -050 +385 13 19:5 2 24 +134 -350 
inu o-oso | osss | 15 | 225 | 0108 14 | 28 0136 | 0310 
few 18 -066 .300 | 17 | 255 . 16 | 32 -138 1282 
39 | 48 -072 76 | 19 | 285 
20 -076 58 | 21 | 316 
30 
30 0-089 | 0-205 
k=5 k=6 k=7 
Im——— 
i à G G n A n À G G 
~ NE Ce 
3 -5 9.407 
T E à 2 6 0-008 3 | 105 | 0-184 | 2-407 
: 12-5 p s “2 \) a 5 | 175 211 | 0-784 
9 17-5 +165 -568 6 18 7 245 :307 512 
u 22.5 -158 +432 8 24 9 31-5 +199 “401 
21-5 -159 360 | 10 | 30 
18 
is | 325 | o159 | 0-317 | 12 | 36 0-176 
37-5 -157 «287 


Table 2. One-way classification by groups.
(Approximate adjusted values are printed in italics)

k = no. of groups; n = no. within a group; λ = nkδ; δ = 1.0; α = β = 0.05.


k=2 k=3 k=4 
n a G G n | A G G n A & 
4 8 | 0052 2-390 3 | 9 | ore | 4760 2 8 | 0u 
3E 082 | r3s0 | 4 12 169 | 179 3 | T. 306 
2 105 1-016 5 | 15 195 1-174 4 1 2 
7 14 121 | 0-826 6 18 210 | 0902 5 20 TE 
8 | 16 135 | -710 1 21 221 | -762 6 24 "284 
| 
| | 
10 20 | 0-150 0-578 9 27 | 0234 | 8 32 0:287 
12 24 166 -504 11 33 240 10 40 :286 
16 32 187 424 | 13 39 | 244 12 48 25A 
20 | 40 199 | 383 | 15 | 45 | 241 16 64 D 
20 80 2 
30 60 0-215 0-333 2 63 0-251 0:377 
60 120 -235 -300 31 93 -251 | +334 30 120 0:271 
51 153 -251 -306 50 | 200 :266 
= ! — eae 
k=5 k-6 al 
ee e$ 
n a G G n A G | G n A 
| T -" | 
Í 
3 5 0-331 | 2-469 2 12 0-381 | 24-042 3 21 
4 20 345 1:332 3 18 405 | 2133 4 28 
5 25 340 0-973 4 24 -398 1-237 5 35 
6 30 335 -792 5 30 -384 0-020 6 42 
7 35 “331 “687 6 36 -373 -763 7 49 
9 45 0-322 0-565 8 48 0-354 0-601 9 63 
11 55 314 498 10 60 -340 -521 11 77 
5 65 :308 456 12 72 -330 473 13 91 
15 -303 -428 14 84 -322 440 15 105 
2. 5 . 
5 125 0-288 0:360 20 120 0-307 0:386 21 147 
| 
k=8 k=9 
ic A G G n x rei G n A 
; n 0-566 7-500 3 27 0-565 1:684 2 20 
2 “518 1:792 : . 
d x5 bes ve 4 36 510 1-085 4 40 
p 2 5 45 -470 0:845 6 60 
s e 448 0-865 8 80 
8 424 -730 7 63 0-421 0:635 10 100 
9 81 -392 -540 
8 e 
-512 2 
p s -336 -414 15 135 | 0-347 | 0-425 E E 
2 324 +387 21 189 +325 +380 30 300 
| 31 279 . EE 
30 240 0-306 0-348 41 369 4 ps s d£ 
40 400 -288 318 | | 
j 


Table 3. One-way classification by groups.
(Approximate adjusted values are printed in italics)

k = no. of groups; n = no. within a group; λ = nkδ; δ = 2.0; α = β = 0.05.


26 


k=2 k=3 k=4 
e A G G n A G ü 
8 0-121 18 0-458 3-361 2 16 0-612 | 22-175 
16 +279 30 497 1-403 4 32 616 1:696 
24 -346 42 507 1:040 6 48 596 1128 
32 -380 54 510 0-891 8 64 583 0-935 
40 -401 66 512 -809 10 80 374 $37 
48 0-416 78 0-512 0:758 12 96 | 0-567 0:777 
64 -434 90 1 24 16 | 128 557 -708 
80 -444 20 160 550 -669 
126 0-510 0-661 
120 | 0-461 186 507 611 | 30 | 240 | 0-040 | 0-619 
160 479 246 505 592 | 40 | 320 535 596 
200 474 306 504 578 50 | 400 532 +582 
a pn 
k=5 k=6 
G a G G 
30 i 24 0-950 — 
: p 48 -765 1:535 
70 -637 72 685 1-081 
90 -613 96 645 0-913 
110 -596 120 620 -825 
-584 144 | 0:602 0-770 
150 pei 192 “578 -706 
240 -563 -669 
jc 360 | 0:542 0-621 
-531 480 530 -598 
510 “525 600 -523 -583 


Table 4. Randomized blocks (true limits)

k = no. of blocks; n = no. of varieties; λ = nkδ; δ = 0.5; α = β = 0.05.


n=2 n=3 n=4 
i n. PCM | G 
k A Gg | à BiA & G k |à a 
| | aa | 
| 
5 5 —  |186-97 4| 6 0-005 | 11-366 3 | 6 0-006 1.635 
T Y 0-027 2-586 6 9 -082 1:637 5 | 10 :120 0-876 
9 9 -061 1:403 $ lie] itz 0-953 7| M +155 bey 
TL 11 -086 1:009 10 | 15 +137 -706 9 18 -168 516 
| 11 22 “175 b: 
13 13 0-105 0-814 12 18 0-148 0-580 0-446 
15 | 15 -120 -698 14 | 21 -155 -503 13 | 26 0-180 “100 
17 | 17 -133 -620 16 | 24 | -160 -451 15 | 30 -182 367 
17 34 :183 343. 
37 | 97 0-170 0-448 22 | 33 0-174 0-365 19 | 38 184 
n=5 n=6 nz 
id , 
= E ü 
k A G G k A G G k | A G 
l pe ae —| 
4 | 10 0-145 2-129 3 9 0-151 4-680 2 7 0-089 1419 
6 | 15 -181 0-914 5 | 15 “211 1-061 4 | 14 1236 0-731 
8 | 20 -194 -627 7 | 21 -218 0:658 6 | 21 :286 .529 
10 | 25 -198 -499 9 | 27 -216 -505 8 | 28 :234 433 
it | 33 -212 -424 10 | 35 :326 
12 | go 0-197 0-426 
14 | 35 196 -380 13 | 39 0-208 0-374 
16 | 40 194 -347 EN 
| 1 
n=8 
k A a G 
3 | 12 0-254 2-627 
5 | 20 -208 0-870 
7 | 28 -254 -573 
9 36 -241 452 


Table 5. Randomized blocks. (Approximate adjusted values are printed in italics)

k = no. of blocks; n = no. of varieties; λ = nkδ; δ = 1.0; α = β = 0.05.


n=2 n=3 n=4 
L NN - 
| I 
| Ae. | ES 
E A | G G gk | aA @ a k A| @ | 8 
| ld ma | 
a | 
3 6 0-006 e 2 6 | 0-009 = 3 | 12 | 0-293 6-32 
id 10 -147 | 85383 4 | 42 -236 3-774 5 | 20 -359 Là 
m d -218 2-129 6 | 18 .302 | 1529 7 | 28 373 | 0-987 
9 | 18 -269 1-470 s | 24 +331 1-071 9 | 36 -375 -795 
AE -304 1188 | 10 | 30 346 | 0874 | n | 44 “375 -694 
| | | 
p 26 0-329 1-032 13 | 36 | 0358 | 0-765 | 
5 | 30 -349 | 0-933 | | | 
| b Baal L 
n=6 n=7 
= ip SE E 
k a | @ | ü k A G ü 
| 
3 | 18 | 0:470 2-86 2 | 14 | 0-529 2-748 
5 | 30 | 453 1137 4 | 28 -507 1-412 
7 | 42 | 430 0-812 6 | 42 463 | 0878 
9 | 54 | 412 | “675 | 
| 
| 


— 
Table 6. Randomized blocks. (Approximate adjusted values are printed in italics)

k = no. of blocks; n = no. of varieties; λ = nkδ; δ = 2.0; α = β = 0.05.

n=2 n=3 n=4
1 i G k À G ü 
1 P G G a 
G G k G 
3 | 24 0-779 4-482 
2 0-488 = 
0-365 z : u 672 3-399 5 | 40 am 186 
-557 4:379 a 36 -716 To 7 S ‘766 139 
15 pr s | 48 -734 1:49 9 2 T5 119 
168 A 10 | 60 -742 1:299 11 | 88 
-756 172 
0:806 1-47 
5 n=6 
n= 
a ü k à a ü 
0-920 | 29-864 3 | 36 0-978 Ar 
+858 2-102 5 | 60 -884 2m 
-810 1-406 7 | 84 is nies 
780 1:166 9 | 108 a 




LOGNORMAL APPROXIMATION TO PRODUCTS AND QUOTIENTS 


By S. R. BROADBENT 
The British Coal Utilization Research Association 


SUMMARY. Measurements have been made which are subject to error, and we are required to give limits to a combination of the values measured. The combinations considered here are products and quotients. Some exact results are available in simple cases, but otherwise an approximation is required. The lognormal distribution, which is asymptotically exact, is shown to give useful approximations when fitted by moments to the combination. This method of fitting is nearly optimum in a defined metric. Tables are given which make its application simple.


1. INTRODUCTION 
The problem we shall consider is assessing the precision of certain combinations of measurements which have been made with a known distribution of error. Suppose the measurements $x_i$ $(i = 1, \ldots, n)$ are made, that $E[x_i] = \mu_i$, and that we are interested in a combination of the $x_i$ of the form

$$q = (x_1 x_2 \ldots x_j)/(x_{j+1} \ldots x_n) \quad (1 \le j \le n)$$

(more generally, the observations $x_i$ may be raised to any power).

We may be required to set fiducial limits to

$$(\mu_1 \mu_2 \ldots \mu_j)/(\mu_{j+1} \ldots \mu_n),$$

i.e. to combine the fiducial distributions of the $\mu_i$ which we have attempted to measure, or to set probability limits to $q$, i.e. to combine the probability distributions of the $x_i$. In the first case we assume the distributions of $x_i$ known, except for the means $\mu_i$, and the observations $x_i$ are given; in the second we assume the distributions of the $x_i$ completely known. The two problems are formally identical,* and we shall henceforward speak only of setting limits to $q$.

To fix the ideas with a particular example of the first case, consider the efficiency 

à steam boiler determined by a single trial in which the heat supplied to and t 
obtained from the boiler are obtained from sample measurements. The efficiency 


caleulated fi 
dnd E = HW( —O)(Qh( —6'), 


he he? 


ay be 


d 
nique aD 
: 
norma 
E 8 
äi " Jaa A s efficient 
rectangular, of known coefficients of variation or half-range, and with small coeffi "m 
any aPP 
and 


js. 4n 


standard deviations or half-ranges can usually be found by simple investigatio! dons ar' 
mati 


applications the errors can only be guessed, and then the roughest approxi! 
appropriate. ie 
* We assume that fiducial distributions can be combined in the same way as probability distributions, but see the discussion in Creasy (1954) on this point.


It is impossible to list exhaustively all such combinations of errors. Some of the simpler combinations can be discussed individually; in the event of a large number of errors being combined we can use with confidence the asymptotic distribution. It is in the intermediate cases that approximations must be critically considered. The choice of a suitable family of approximating distributions will always be more of an art than a science.

It is possible in a particular case to refine approximation to any required degree. Gram-Charlier or similar series may be used; the saddlepoint method given by Daniels (1954) also generates approximating functions. However, in this paper we are concerned with a working or first-order approximation only: all that some data and applications merit.

It is well known that, even when $n$ is small and the component distributions are far from normal, $x_1 + \ldots + x_n$ has a distribution close to normal. When the component variates are not added but multiplied and divided, the fundamental approximating distribution is the lognormal, as was pointed out, for example, by Shellard (1952). We consider the questions, how is this distribution best fitted and when may we use it with confidence?


2. KNOWN RESULTS

2.1. We now suppose the errors of measurement are independent, and we denote a normal variate by $N$ and a rectangular variate by $R$. The quotient of the standard deviation and the mean of $N$, and the quotient of the half-range and mean of $R$ are both denoted by $\alpha_1$, $\alpha_2$, etc., referring to the variates $x_1$, $x_2$, ... in this order. Thus $R/N$ ($\alpha_1 = 0.1$, $\alpha_2 = 0.03$) denotes the quotient of a rectangular variate whose mean is ten times its half-range and an independent normal variate whose coefficient of variation is 3 %.

For $q = N/N$ Geary's approximation (1930) is used in all practical cases. This states that, if $Q$ is the quotient of the means of the numerator and denominator,

$$(q-Q)/(\alpha_1^2 Q^2 + \alpha_2^2 q^2)^{\frac12}$$

is approximately normally distributed with zero mean and unit variance. This approximation is very good up to values of $\alpha_1$ and $\alpha_2$ as large as 0.25 (see Creasy, 1954). The exact percentage points of this distribution may also be calculated using existing tables as was shown by Fieller (1932). The product $N \times N$ was discussed by Craig (1936) and Aroian (1947).

Percentage points of $R/N$ have been given by the author (1954). He has given also points for the quotient of a triangular and a normal variate.

The distributions of $R \times R$, $R/R$ and so on are not difficult to calculate exactly.

Distributions of products, quotients and powers of variates have been extensively studied, but not many results have been obtained that can be applied to practical problems.

3. LOGNORMAL APPROXIMATIONS

3.1. The distribution of

$$q = (x_1 \ldots x_j)/(x_{j+1} \ldots x_n)$$

tends to the lognormal as $n \to \infty$ under very general conditions. The lognormal distribution has been discussed by Finney (1941), Gaddum (1945) and Johnson (1949).

The most general lognormal approximation to $q$ is a variate $z$ such that $\log(z-\xi)$ is normally distributed with mean $\mu$ and variance $\sigma^2$. Here for simplicity we consider the choice of $\mu$ and $\sigma$ only, $\xi$ being always taken as zero. A variate with this distribution is necessarily positive while $q$ may have negative values. When $\alpha_1, \alpha_2, \ldots$ are small, the approximation by this lognormal distribution may nevertheless be satisfactory, since the probability attached to such negative values is very small.

3.2. The first method of choosing $\mu$ and $\sigma$ is to calculate the moments of $\log q$ and to set $\mu$ and $\sigma^2$ equal to the first and second corrected moments. We call this the method of fitting by moments to $\log q$, or the fit to $\log q$; this method is used for estimation by Finney (1941). It is also intuitively attractive to choose that lognormal approximation whose mean and variance are equal to the mean and variance of $q$ (see Wicksell, 1917). We call this the method of fitting by moments to $q$, or the fit to $q$. The two fits are in general different, though the difference tends to zero as $n$ increases. The difference throws a doubt on the intuitive appeal of fitting by moments, and raises the question we consider below, by what criterion are we to choose our approximation?

Let a lognormal distribution have mean $m$ and variance $s^2$, and let the distribution of the logarithm of a variate from the lognormal have mean $\mu$ and variance $\sigma^2$. Then it is known that

$$\left.\begin{aligned} m &= \exp(\mu + \sigma^2/2),\\ s^2 &= \{\exp(2\mu + \sigma^2)\}\{\exp(\sigma^2) - 1\},\\ \mu &= \log m - \tfrac12\log(1 + s^2/m^2),\\ \sigma^2 &= \log(1 + s^2/m^2). \end{aligned}\right\} \qquad (1)$$

For the fit to $\log q$, we must find the mean and variance of $\log q$, and set $\mu$ and $\sigma^2$ equal to them. The $t$ % point of this fit is

$$\exp(\mu + \tau_t\sigma), \qquad (2)$$

where the probability is $t/100$ that a standardized normal variate is less than $\tau_t$.

For the fit to $q$, we must find the mean and variance of $q$, and set $m$ and $s^2$ equal to them. The $t$ % point of this fit is

$$\left(m\exp[\tau_t\{\log(1+v^2)\}^{\frac12}]\right)/(1+v^2)^{\frac12}, \qquad (3)$$

where $100v = 100s/m$, the coefficient of variation of $q$. This $t$ % point is tabulated in Table 1 for $m = 1$, $100v = 0\ (0.01)\ 15$ and $t = 1$, 5, 95 and 99. To use this table it is necessary to find the mean and coefficient of variation of $q$, to enter the table with the appropriate $v$ and to multiply the value in the table by $m$. The table was calculated from a series expansion.
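The relations (1) and the percentage point (3) are easily evaluated without the table. The sketch below is an illustration with hypothetical function names; it uses the standard library's normal quantile function in place of tabulated values of the normal deviate.

```python
import math
from statistics import NormalDist


def lognormal_from_moments(m, s):
    """Return (mu, sigma^2) of log z for a lognormal z with mean m and s.d. s, as in (1)."""
    sigma2 = math.log(1.0 + (s / m) ** 2)
    mu = math.log(m) - 0.5 * sigma2
    return mu, sigma2


def percent_point(m, v, t):
    """t % point of the fit to q, formula (3); v is the coefficient of variation s/m."""
    tau = NormalDist().inv_cdf(t / 100.0)  # standardized normal deviate for t %
    return m * math.exp(tau * math.sqrt(math.log(1.0 + v * v))) / math.sqrt(1.0 + v * v)
```

Since (3) is proportional to $m$, evaluating `percent_point(1.0, v, t)` and multiplying by the mean mirrors the use of the table described above.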

3.3. To find the moments of $q$ or of $\log q$ we require the moments of various powers (positive and negative) of $x$ or of $\log x$, when $x$ is normally and when $x$ is rectangularly distributed. We write $x = \mu(1+\alpha y)$, where $y$ is either a standardized normal variate or uniformly distributed between $-1$ and $1$, and $\alpha$ is small (less than 0.15). We begin by finding the moments of $\log(1+\alpha y)$, not when $y$ is normally distributed, but (since $\Pr\{1+\alpha y < 0\} = \epsilon$ is less than $10^{-9}$) when $y$ is from the truncated distribution

$$\exp(-\tfrac12 y^2)\,dy/\{(1-\epsilon)\sqrt{(2\pi)}\} \quad (y > -1/\alpha), \qquad (4)$$

which is for practical purposes indistinguishable from the normal. We avoid mathematical difficulties by taking this distribution as the parent. Alternatively, the treatment of convergence given by Derksen (1939) could be used.

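The moment-generating function of $\log(1 + \alpha y)$ is $E[(1 + \alpha y)^{it}]$, and this may be written

$$\int_{-1/\alpha}^{1/\alpha} \sum_{r=0}^{\infty} \binom{it}{r} \alpha^r y^r \exp(-\tfrac12 y^2)\,dy \Big/ \{(1 - \epsilon)\sqrt{(2\pi)}\} + K,$$

where $|K| < \epsilon/(1 - \epsilon)$.

Let

$$J(\alpha, r) = \frac{1}{\sqrt{(2\pi)}} \int_{1/\alpha}^{\infty} y^r \exp(-\tfrac12 y^2)\,dy
= \{1 - I[u, \tfrac12(r - 1)]\}\,2^{r/2 - 1}\,\Gamma\{\tfrac12(r + 1)\}/\sqrt{\pi},$$

with $u = 1/\{\alpha^2\sqrt{(2r + 2)}\}$.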




S. R. BROADBENT 407 


Here $I(u, p)$ is the Incomplete Gamma ratio $\int_0^{u\sqrt{(p+1)}} v^p e^{-v}\,dv/\Gamma(p + 1)$, and is nearly one when $u$
is large and $p$ small; we deduce from Pearson's table (1922) that for $\alpha < 0.15$ and $r \le 6$,
$J(\alpha, r) < 10^{-9}$, and we therefore neglect it in this region.

The series above is uniformly convergent within the range of integration, so that term-by-
term integration is permissible. We may replace the limits of integration by $(-\infty, \infty)$ in
the first four non-zero terms when $\alpha < 0.15$, since the error so introduced is a factor of
$(1 - 2J(\alpha, r)/(1 - \epsilon))$ for $r = 0$, 2, 4 and 6. As $K$ is negligible we obtain to sufficient accuracy
the first four terms of the convergent series for the moment-generating function

$$1 + \alpha^2\,it(it - 1)/2 + \alpha^4\,it(it - 1)(it - 2)(it - 3)/8 + \alpha^6\,it \ldots (it - 5)/48,$$


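and hence the cumulants of $\log(1 + \alpha y)$:

$$\begin{aligned}
\mu_1' = \kappa_1 &= -\alpha^2/2 - 3\alpha^4/4 - 5\alpha^6/2 - \ldots,\\
\mu_2 = \kappa_2 &= \alpha^2 + 5\alpha^4/2 + 32\alpha^6/3 + \ldots,\\
\kappa_3 &= -3\alpha^4 - 22\alpha^6 - \ldots,\\
\kappa_4 &= 20\alpha^6 + \ldots.
\end{aligned} \qquad (5)$$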
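The first two series in (5) can be checked numerically. The sketch below is an illustration only: it integrates $\log(1 + \alpha y)$ against the normal density by Simpson's rule, cutting the range at $\pm 8$ (an assumption that stands in for the truncated parent (4) and requires $8\alpha < 1$), and compares the result with the three-term series. All function names are hypothetical.

```python
from math import exp, log, pi, sqrt


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def log_cumulants_numeric(alpha, cut=8.0):
    """Mean and variance of log(1 + alpha*y) for y standard normal cut off at +/-cut."""
    if alpha * cut >= 1.0:
        raise ValueError("need alpha * cut < 1 so that 1 + alpha*y stays positive")

    def phi(y):
        return exp(-0.5 * y * y) / sqrt(2.0 * pi)

    k1 = simpson(lambda y: log(1.0 + alpha * y) * phi(y), -cut, cut)
    second = simpson(lambda y: log(1.0 + alpha * y) ** 2 * phi(y), -cut, cut)
    return k1, second - k1 * k1


def log_cumulants_series(alpha):
    """First three terms of kappa_1 and kappa_2 as given in (5)."""
    a2 = alpha * alpha
    k1 = -a2 / 2 - 3 * a2 ** 2 / 4 - 5 * a2 ** 3 / 2
    k2 = a2 + 5 * a2 ** 2 / 2 + 32 * a2 ** 3 / 3
    return k1, k2


if __name__ == "__main__":
    print(log_cumulants_numeric(0.1))
    print(log_cumulants_series(0.1))
```

For $\alpha = 0.1$ the two pairs of values differ only by quantities of the order of the first neglected term, $\alpha^8$.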


The cumulants of $\log(1 + \alpha y)$ when $y$ is rectangularly distributed in $(-1, 1)$ are obtained
in a similar way. We have

$$\begin{aligned}
\mu_1' = \kappa_1 &= -\alpha^2/6 - \alpha^4/20 - \alpha^6/42 - \ldots,\\
\mu_2 = \kappa_2 &= \alpha^2/3 + 7\alpha^4/45 + 29\alpha^6/315 + \ldots,\\
\kappa_3 &= -2\alpha^4/15 - 512\alpha^6/3780 - \ldots,\\
\kappa_4 &= -2\alpha^4/15 - 16\alpha^6/315 - \ldots.
\end{aligned} \qquad (6)$$

We next require the expectation of various powers of $(1 + \alpha y)$ when $y$ is normally
distributed and when $y$ is rectangularly distributed. These have been calculated for the
truncated normal distribution (4) and the rectangular distribution; the arguments are similar
to those given above and are not repeated here. Finally, we have arranged the results
commonly required in Table 2. These formulae, with (2) or Table 1, enable the percentage
points of the two lognormal fits to $q$ to be calculated.

The power series above have been formed by identifying coefficients in the expansion of
the moment-generating function. We have given only the first four terms of this expansion,
for moderate $\alpha$. For later terms or larger $\alpha$, important corrections would have to be applied
to the coefficients obtained by uncritically continuing the series. Indeed, Wicksell (1921)
pointed out that some of the series in (5) and (6) may diverge if the later coefficients are
uncritically formed. The simple rules for obtaining these coefficients apply only under the
conditions stated.

The results given may be extended to cases in which the variates are not independent. It
is necessary only to find the moments for correlated variates in the same way as above;
some moments are given by Haldane (1942).

As an example, consider the lognormal approximations to $q = N/N$; by convention the
coefficients of variation of numerator and denominator are respectively $100\alpha_1$ and $100\alpha_2$.
Using Table 2 we obtain

$$E[q] = m = (1 + \alpha_2^2 + 3\alpha_2^4 + \ldots),$$

and

$$E[q^2] = (1 + \alpha_1^2)(1 + 3\alpha_2^2 + 15\alpha_2^4 + \ldots),$$

$$V[q] = s^2 = \alpha_1^2 + \alpha_2^2 + 3\alpha_1^2\alpha_2^2 + 8\alpha_2^4 + \ldots.$$


408 


Lognormal approximation to products and quotients 


Table 1. Standardized lognormal percentage points 


Given the mean $m$ and standard deviation $s$, let $v = 100s/m$. The percentage point of the lognormal
distribution with this mean and standard deviation is the entry in the table, multiplied by $m$.


Lower 1% points ($t = 1$)

 v  |  0.0   | 0.1  | 0.2  | 0.3  | 0.4  | 0.5  | 0.6  | 0.7  | 0.8  | 0.9

  0 | 1.0000 | 9977 | 9954 | 9930 | 9907 | 9884 | 9861 | 9838 | 9815 | 9792
  1 | 0.9770 | 9747 | 9724 | 9701 | 9679 | 9656 | 9633 |      | 9588 | 9566
  2 | 0.9544 | 9521 | 9499 | 9477 | 9454 | 9432 | 9410 |      | 9366 | 9344
  3 | 0.9322 | 9300 | 9278 | 9256 | 9234 | 9213 | 9191 |      | 9148 | 9126
  4 | 0.9105 | 9083 | 9061 | 9040 | 9019 | 8997 | 8976 |      | 8933 | 8912
  5 | 0.8892 | 8870 | 8849 | 8828 | 8807 | 8787 | 8766 |      | 8724 | 8703

  6 | 0.8683 | 8662 | 8641 | 8621 | 8600 | 8580 | 8559 | 8539 | 8519 | 8498
  7 | 0.8478 | 8458 | 8438 | 8418 | 8398 | 8378 | 8357 | 8338 | 8318 | 8298
  8 | 0.8278 | 8258 | 8238 | 8219 | 8199 | 8179 | 8160 | 8140 | 8121 | 8101
  9 | 0.8082 | 8062 | 8043 | 8024 | 8004 | 7985 | 7966 | 7947 | 7928 | 7909
 10 | 0.7890 | 7871 | 7852 | 7833 | 7814 | 7795 | 7776 | 7758 | 7739 | 7720
 11 | 0.7702 | 7683 | 7665 | 7646 | 7628 | 7609 | 7591 | 7572 | 7554 | 7536
 12 | 0.7518 | 7500 | 7481 | 7462 | 7445 | 7427 | 7409 | 7391 | 7374 | 7356
 13 | 0.7338 | 7320 | 7302 | 7285 | 7267 | 7249 | 7232 | 7214 | 7197 | 7179
 14 | 0.7162 | 7144 | 7127 | 7110 | 7093 | 7075 | 7058 | 7041 | 7024 | 7007
 15 | 0.6990 |
Lower 5% points ($t = 5$)
] 
v 0-0 0-1 0-2 0-3 0-4 0-5 0-6 0-7 0-8 0-9 
| | 
0 | 1-0000 | 9983 | 9967 ` 9951 | 9934 | 9918 | 9902 | 9885 | 9869 | 9853 
1 | 0.9836 | 9820 | 9804 | 9788 | 9771 | 9755 | 9739 | 9723 | 9707 | 9691 
2 0.9675 | 9658 | 9642 | 9626 | 9610 | 9594 | 9578 | 9562 | 9546 9530 
3 | 0.9514 | 9498 | 9482 | 9467 | 9451 | 9435 | 9419 | 9403 | 9388 9372 
4 | 09356 | 9340 | 9324 | 9309 | 9293 | 9278 | 9262 | 9246 | 9231 | 9215 
5 | 0.9200 | 9184 | 9168 | 9153 | 9137 | 9122 | 9106 | 9091 | 9075 | 9060 
6 | 0.9045 | 9029 | 9014 | 8999 | 8983 | 8968 | 8953 | 8937 | 8922 8907 
7 | 08892 | 8877 | 8862 | 8846 | S831 | 8816 | 8801 | 8786 | 8771 | 8756 
8 | 0-8741 | 8726 | 8711 | 8696 | S681 | s666 | 8651 | 8636 | 8622 | 8607 
e 0-8592 | 8577 | 8562 | 8548 | 8533 | S518 | 8503 | S489 | 8474 uw 
0 | 0-8445 | 8430 | 8415 | 8401 | 8386 | 8372 | 8357 | 8343 | 8328 | 831 
5 0:8299 | 8285 | 8271 | 8256 | 8242 | 8228 | 8213 | 8199 | 8185 8170 
is Da6 8142 | 8128 | 8113 | 8099 | 8085 | 8071 | 8057 | 8043 ore 
a 0.8015 | 8001 | 7987 | 7973 | 7959 | 7945 | 7931 | 7917 | 7903 | 788° 
0-7875 | 7861 | 7848 | 7834 | 7820 | 7806 | 7793 | 7779 | 7765 | 7751 
15 | 0-7738 es 


Table 1 (cont.) 


Upper 5% points (t= 95) 




x 0-0 0-1 0:2 03 | 04 0-5 0-6 0-7 08 09 A 
LI. ' | 
| | 
9 | 1-0000 | oo16 | 0033 | 0049 | 0066 | 0083 | 0099 | 0115 0132 | 0149 | +17 
l | 1:0165 | o182 | o198 | 0215 | 0232 | 0249 | 0265 | 0282 | 0299 | 0316 | +17 
: 1:0332 | 0349 | 0366 | 0383 | 0400 | 0417 | 0433 | 0450 | 0467 | 0484 +17 
4 1-0501 | 0518 | 0535 | 0552 | 0569 | 0586 0603 | 0620 | 0637 | 0654 | +17 
10671 | ogg | 0705 | 0723 | 0740 | 0757 | 0774 0791 | 0809 | 0826 | +17 
5 | 1.0843 | oseo | os7s | 0895 | 0912 | 0930 | 0947 0964 | 0982 | 0999 | +17 
f 131017 | 1034 | 1051 | 1069 | 1086 | 1104 1121 | 1139 | 1156 | 1174 | +17 
11191 | 1209 | 1226 | 1244 | 1262 | 1279 | 1297 1315 | 1332 | 1350 | +18 
8 | 11368 | 1385 | 1403 | 1421 | 1439 | 1456 | 1474 | 1492 | 1510 | 1528 | +18 
ie 11545 | 1503 | 1581 | 1599 | 1617 | 1635 | 1653 | 1671 1689 | 1707 | +18 
11795 | 1742 | 1760 | 1779 | 1797 | 1815 1833 | 1851 | 1869 | 1887 | +18 
| | 
i: 11905 | 1923 | 1941 | 1959 | 1978 | 1996 2014 2 | 2050 2009 | +18 
13 12087 | 2105 | 2123 | 2142 | 2160 | 2178 2196 | 2215 | 2233 | 2252 | +18 
m 1:2270 | 2288 | 2307 | 2325 | 2343 | 2362 2380 | 2399 | 2417 | 2436 | +18 
is | 12454 | 2472 | 2491 2510 | 2528 | 2547 | 2565 | 2584 2602 | 2621 | +19 
1-2639 = a ME "e = = = = = an 
Upper 1% points ($t = 99$)
Bi as | og | oz | os | oa | 95 | $9 | "9 | 08 09 A 
poe g | 
| | 
? 1-0000 | 0023 | 0047 | oozo | 0094 | 0117 | 0141 | 0164 | 0188 | 0211 | 4-23 
2 | 10235 | 0259 | 0282 0306 | 0330 | 0354 | 0378 | 0402 0426 | 0450 | +24 
3 | L0474 | 0408 | 0522 0547 | 0571 | 0595 | 0620 | 0644 | 0669 | 0693 | +24 
h a | FOUS | 0743 | 0767 | 0792 | 0817 | 0841 | 0866 | 0891 0916 | 0941 THE 
| 5 | 10966 | o991 | 1016 1041 | 1067 | 1092 | 1117 | 1142 | 1168 | 1193 | +26 
14219 | 1244 | 1279 | 1296 | 1321 | 1347 e1372 | 1398 | 1424 | 1450 4-26 
; 11476 | 1502 | 1528 | 1554 | 1580 | 1606 1632 | 1659 | 1685 mi M 
8 1-1738 | 1764 | 1790 | 1817 | 1843 1870 | 1896 | 1923 1950 d Mn 
9 | 12004 | 2030 | 2057 | 2084 | 2111 | 2138 | 2165 | 2193 2220 2247 pes 
io | L3274 | 2302 | 2329 2356 | 2384 | 2411 | 2439 2466 2494 | 252 jr 
| 1-2549 | 2577 | 2605 | 2633 | 2661 | 2689 2716 | 2745 | 2778 | 2801 | + 
| | A D 
ù 12899 | 2857 | 2885 | 2914 | 2942 | 2970 3027 | 3056 m + 
13 | b3113 | 3141 | 3170 | 3199 | 3228 3257 3314 | 3343 ar T5 
14 | 13402 | 3431 | 3460 | 3489 3518 | 3547 3606 3836 ied +29 
| 15 raone 3724 | 3753 | 3785 | 3813 | 3843 3902 | 3932 | 3962 +30 
“3992 "m ust = — ES = "E i = 


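Similarly,

$$E[\log q] = \mu = -\tfrac12(\alpha_1^2 - \alpha_2^2) - \tfrac34(\alpha_1^4 - \alpha_2^4) - \ldots,$$

and

$$V[\log q] = \sigma^2 = (\alpha_1^2 + \alpha_2^2) + \tfrac52(\alpha_1^4 + \alpha_2^4) + \ldots.$$

Using these values of $\mu$ and $\sigma^2$ we obtain for the first two moments of the lognormal fitted
to $\log q$,

mean: $1 + \alpha_2^2 + \tfrac12\alpha_1^4 + \tfrac52\alpha_2^4 + \ldots$,

variance: $\alpha_1^2 + \alpha_2^2 + 3\alpha_1^4 + 3\alpha_1^2\alpha_2^2 + 5\alpha_2^4 + \ldots$.

Since these agree with $m$ and $s^2$ to $O(\alpha_1^2, \alpha_2^2)$, and to higher order when $\alpha_1 = \alpha_2$, the difference
between the two fits will be small.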
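The comparison of the two fits for $q = N/N$ can be sketched numerically as follows. The code uses only the truncated series for the moments of $q$ and of $\log q$ (normal numerator and denominator, unit means), so it is an approximation under those assumptions; the function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def fit_to_q(a1, a2):
    """(mu, sigma) of the lognormal matched to the series mean and variance of q."""
    m = 1 + a2 ** 2 + 3 * a2 ** 4
    s2 = a1 ** 2 + a2 ** 2 + 3 * a1 ** 2 * a2 ** 2 + 8 * a2 ** 4
    sigma2 = log(1 + s2 / m ** 2)            # equation (1)
    return log(m) - 0.5 * sigma2, sqrt(sigma2)


def fit_to_log_q(a1, a2):
    """(mu, sigma) taken directly from the series mean and variance of log q."""
    mu = -0.5 * (a1 ** 2 - a2 ** 2) - 0.75 * (a1 ** 4 - a2 ** 4)
    sigma2 = a1 ** 2 + a2 ** 2 + 2.5 * (a1 ** 4 + a2 ** 4)
    return mu, sqrt(sigma2)


def point(t, mu, sigma):
    """t% point exp(mu + tau_t * sigma) of a lognormal, as in (2)."""
    return exp(mu + NormalDist().inv_cdf(t / 100.0) * sigma)


if __name__ == "__main__":
    for t in (1, 5, 95, 99):
        print(t, point(t, *fit_to_q(0.1, 0.1)), point(t, *fit_to_log_q(0.1, 0.1)))
```

With both coefficients of variation at 10% the two sets of percentage points differ only in the fourth decimal place, in line with the remark that the difference between the fits is small.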


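Table 2

  r   | y is distributed normally with       | y is distributed rectangularly
      | mean zero and variance one           | in the interval $(-1, 1)$

$E[(1 + \alpha y)^r]$
  1/2 | $1 - \alpha^2/8 - 15\alpha^4/128 - \ldots$   | $1 - \alpha^2/24 - \alpha^4/128 - \ldots$
  1   | $1$                                  | $1$
  2   | $1 + \alpha^2$                        | $1 + \alpha^2/3$
  4   | $1 + 6\alpha^2 + 3\alpha^4$            | $1 + 2\alpha^2 + \alpha^4/5$
 -1/2 | $1 + 3\alpha^2/8 + 105\alpha^4/128 + \ldots$ | $1 + \alpha^2/8 + 7\alpha^4/128 + \ldots$
 -1   | $1 + \alpha^2 + 3\alpha^4 + \ldots$       | $1 + \alpha^2/3 + \alpha^4/5 + \ldots$
 -2   | $1 + 3\alpha^2 + 15\alpha^4 + \ldots$     | $1/(1 - \alpha^2)$
 -4   | $1 + 10\alpha^2 + 105\alpha^4 + \ldots$   | $1 + 10\alpha^2/3 + 7\alpha^4 + \ldots$

$E[\log(1 + \alpha y)^r]$ | $-r(\alpha^2/2 + 3\alpha^4/4 + \ldots)$ | $-r(\alpha^2/6 + \alpha^4/20 + \ldots)$
$V[\log(1 + \alpha y)^r]$ | $r^2(\alpha^2 + 5\alpha^4/2 + \ldots)$  | $r^2(\alpha^2/3 + 7\alpha^4/45 + \ldots)$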
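For the rectangular column of Table 2 the expectation has a closed form, which gives a quick check on the series. The sketch below is an illustration only; the dictionary of coefficients is transcribed from the table and the function names are assumptions.

```python
def rect_power_mean_exact(alpha, r):
    """Exact E[(1 + alpha*y)^r] for y uniform on (-1, 1); r must not equal -1."""
    return ((1 + alpha) ** (r + 1) - (1 - alpha) ** (r + 1)) / (2 * alpha * (r + 1))


# Coefficients of alpha^2 and alpha^4 in the rectangular column of Table 2.
RECT_SERIES = {
    0.5: (-1 / 24, -1 / 128),
    2: (1 / 3, 0.0),
    4: (2.0, 1 / 5),
    -0.5: (1 / 8, 7 / 128),
    -4: (10 / 3, 7.0),
}


def rect_power_mean_series(alpha, r):
    """Series value 1 + c2*alpha^2 + c4*alpha^4 for the tabulated powers r."""
    c2, c4 = RECT_SERIES[r]
    return 1 + c2 * alpha ** 2 + c4 * alpha ** 4


if __name__ == "__main__":
    for r in RECT_SERIES:
        print(r, rect_power_mean_exact(0.1, r), rect_power_mean_series(0.1, r))
```

The case $r = -2$ needs no series, since the closed form reduces to $1/(1 - \alpha^2)$ exactly.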
It is not possible to state how different the two will be in general. We compare the
percentage points given by the two fits with some exact values (Table 3). The comparisons
are made for quite large coefficients of variation, and for distributions rather different from
lognormal. The agreement is surprisingly good (it would be worse at more extreme percentage
points). The two fits generally give points closer to each other than to the exact
value, i.e. choice between the two methods does not appear to be important.


. if 
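3.4. We turn now to the normal approximation. Cramér (1951) has shown that if $H$ is
a function of the central moments of a multi-dimensional sample of size $S$, such that $H$
and its first two derivatives with respect to these moments are continuous, then $H$ is
asymptotically normal. If $H(m_1, \ldots)$ is this function, its mean and variance are asymptotically
$H(\mu_1, \ldots)$, and

$$\mu_2(m_1)\left(\frac{\partial H}{\partial \mu_1}\right)^2 + \ldots + 2\mu_{11}(m_1, m_2)\left(\frac{\partial H}{\partial \mu_1}\right)\left(\frac{\partial H}{\partial \mu_2}\right) + \ldots.$$

Now $q = (x_1 x_2 \ldots x_s)/(x_{s+1} \ldots x_n)$ is in the form required by this theorem, with $S = 1$.
It is sometimes said that $q$ is approximately normally distributed, with mean

$$(\mu_1\mu_2 \ldots \mu_s)/(\mu_{s+1} \ldots \mu_n)$$

and variance

$$\sigma^2(x_1)\left(\frac{\partial q}{\partial x_1}\right)^2 + \ldots + 2\rho_{12}\sigma_1\sigma_2\,\frac{\partial q}{\partial x_1}\,\frac{\partial q}{\partial x_2} + \ldots,$$

that is, with coefficient of variation

$$100\{\alpha_1^2 + \ldots + \alpha_n^2 + 2\rho_{12}\alpha_1\alpha_2 + \ldots - 2\rho_{s,s+1}\alpha_s\alpha_{s+1} - \ldots\}^{1/2}.$$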


The mean and variance of $q$ agree with these formulae to $O(\alpha^2)$ and $O(\alpha^4)$ respectively when
the $x_i$ are normally distributed, and so do not differ greatly for small $\alpha$. Now the normal
and lognormal distributions do not differ greatly in their 1, 5, 95 and 99% points for small
coefficients of variation. If the lognormal approximation to $q$ is good, the normal approximation
will also be good for small $\alpha$. This appears to be a better justification for the normal
approximation in such cases than reliance on the application of Cramér's asymptotic
theorem to a sample of size one. Brunt (1931) has derived this approximation by a Taylor
expansion of $q$ and the assumption of normality.


4. GENERAL USE OF THE APPROXIMATION 


It is necessary to know how useful the lognormal approximation is in practice.
The exact distributions for which we require approximate percentage points are not
usually known, so it is impossible to give exact results. Equally, in many problems it is as
reasonable to suppose the variates are lognormally distributed as to consider them normally
distributed, and then the question is trivial.

In Table 3 some typical comparisons with simple distributions are given. It is intuitively
clear that the approximation will improve as the $\alpha_i$ decrease or as the number of
component variates increases. If we are satisfied by the agreement indicated in Table 3,
we may use the method with confidence in more complicated situations with smaller $\alpha_i$.

As the number of variates increases, larger coefficients of variation than those given
in Table 3 may be allowed without impairing the approximation. It is interesting to know
how much larger the coefficients of variation may be. Some quantitative conclusions may
be drawn from the cumulants of $\log q$, where $q$ is the product of $n$ independent normal
variates:

$$q = x_1 x_2 \ldots x_n.$$

Suppose the constant $\mu_1\mu_2 \ldots \mu_n$ has already been removed, so that $x_i$ has mean one and
variance $\alpha_i^2$; suppose also that the lognormal approximation to $q$ has been deemed satisfactory.
We now wish to approximate to $q' = q\,x_{n+1}$, where $x_{n+1}$ has mean one and variance $\beta^2$.

The lognormal approximation to $q'$ will remain satisfactory for small $\beta$; for large $\beta$ the
distribution of $q'$ will be unduly affected by $x_{n+1}$ and the lognormal approximation will no
longer satisfy us. The problem is to determine conditions on $\beta$, in relation to the $\alpha_i$, which
allow us to use with confidence a new lognormal approximation. These conditions cannot
ensure that the new approximation is from every point of view as good as the old; for
example, the exact and approximate cumulative distributions will not generally coincide
where they coincided before.
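It follows from (5) that the fit to $\log q$ supposes that

$$u = (\log q - \kappa_1)/\kappa_2^{1/2}$$

is approximately normally distributed with zero mean and unit variance, where

$$\kappa_1 = -\lambda/2 - 3\mu/4 - 5\nu/2,$$

and

$$\kappa_2 = \lambda + 5\mu/2 + 32\nu/3.$$

Here

$$\lambda = \sum_{i=1}^{n} \alpha_i^2, \qquad \mu = \sum_{i=1}^{n} \alpha_i^4 \qquad\text{and}\qquad \nu = \sum_{i=1}^{n} \alpha_i^6.$$

The third and fourth cumulants of $u$ are approximately

$$\kappa_3 = -3\mu/\lambda^{3/2} \qquad\text{and}\qquad \kappa_4 = 20\nu/\lambda^2,$$

and are of the order of $3\alpha/n^{1/2}$ and $20\alpha^2/n$ respectively.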


For $q'$ similar relations hold, with

$$\lambda' = \lambda + \beta^2, \qquad \mu' = \mu + \beta^4 \qquad\text{and}\qquad \nu' = \nu + \beta^6.$$

If the new third cumulant is less than or equal to $\kappa_3$ in absolute value, and the new fourth
cumulant less than or equal to $\kappa_4$, we have grounds for believing that the normal approximation
to $\log q'$ and the lognormal approximation to $q'$ will be at least as satisfactory as the
approximations to $q$. The condition on the third cumulant is approximately

$$\lambda^{3/2}(\mu + \beta^4) \le (\lambda + \beta^2)^{3/2}\mu.$$


Table 3. Exact and lognormal approximation percentage points 


7 
Distribution = R N RIR mw | wxw | ND 
— — P | mE ———( mel 
| 
1002; ... 20 10 | 10 | 10 4 8 
LnL— oi b E — 2 » -— el ee a i) 
| | 5 
100z, ... — xs d 5 4 
—————— o — -—— SS —" -s 9 L|LT————— NET | 
.848 
1% | Exact | osoa | 0767 | 0842 | 0-846 | 0877 pe 
| Fit tog 0-760 0-789 0-827 0-837 0:875 0-848 
Fit to log q 0-758 0-786 0-827 0-836 0-877 
_ | - — d ber 
m .890 
596 Exact 0-820 0.8386 | 0872 0-882 0-908" #390 
Fit to q | 0-822 (845 | 0-874 0-881 0-910 0-390 
Fit to log q 0-820 0-842 | 0-874 0-881 0-910 
| ws | 
R 1124 
95% Exact 1-180 1:164 1:147 1-133 1:095* 1424 
Fit to q 1:200 1-173 1:144 1:134 A 1124 
Fit to log q 1:203 1-175 1:144 1:134 T 
we nem 
"C = 4480 
99% Exact 1:196 1:233 1-188 1-185 1-135* Lar 
Fit to q 1-298 1-255 1-209 1-194 1-139 1-180 
Fit to log q 1:303 1-259 1-210 1-195 1140 
| 


* Exact points communicated by Prof. L. A. Aroian. 


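We write $b = \beta^2/\lambda$ and $\gamma = \lambda^2/\mu$; then the condition on $b$ is

$$\gamma^2 b^3 - b^2 + (2\gamma - 3)b - 3 \le 0.$$

It follows from Cauchy's inequality that $1 \le \gamma \le n$. The condition is satisfied at $b = 0$ and
until

$$\gamma = \{(b + 1)^{3/2} - 1\}/b^2.$$

Similarly, the condition on the fourth cumulant is approximately

$$\lambda^2(\nu + \beta^6) \le (\lambda + \beta^2)^2\nu.$$

We write $\delta = \lambda^3/\nu$; it follows from Cauchy's inequality that $1 \le \delta \le n^2$. The condition
becomes

$$\delta b^2 - b - 2 \le 0,$$

and is satisfied at $b = 0$ and until

$$b = \{(1 + 8\delta)^{1/2} + 1\}/(2\delta).$$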
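The two approximate conditions can be turned into critical values of $\beta$ as in the sketch below, written for the case of equal coefficients of variation. It solves the third-cumulant condition by bisection and uses the closed form for the fourth; since both conditions are themselves approximations, the results need not agree with published critical values in the second decimal place. The function names are hypothetical.

```python
from math import sqrt


def critical_b_k3(gamma, hi=10.0, tol=1e-12):
    """Largest b with 1 + gamma*b^2 <= (1 + b)^(3/2), for gamma >= 1, by bisection."""
    def f(b):
        return 1 + gamma * b * b - (1 + b) ** 1.5

    lo = 1e-9                 # f is negative just above zero and positive at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return lo


def critical_b_k4(delta):
    """Positive root of delta*b^2 - b - 2 = 0."""
    return (sqrt(1 + 8 * delta) + 1) / (2 * delta)


def critical_beta_equal(n, alpha=1.0):
    """Critical beta under each criterion when alpha_1 = ... = alpha_n = alpha."""
    lam = n * alpha ** 2      # lambda; then gamma = n and delta = n^2
    beta_k3 = sqrt(critical_b_k3(float(n)) * lam)
    beta_k4 = sqrt(critical_b_k4(float(n) ** 2) * lam)
    return beta_k3, beta_k4


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 6):
        print(n, critical_beta_equal(n))
```

As $n$ increases the two values tend to $\sqrt{1.5}\,\alpha$ and $2^{1/4}\alpha$ respectively, which is the asymptotic criterion.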




No mog be inp when the x, are raised to positive or negative powers, and 
MN res ot > ipo When q consists of the product of independent rectangular variates 
NS ha oth x, and K4 are O(a"), and hence that the conditions on both these 
E feces in an acceptable region similar to the first of the two defined above. 
4.1. Some practical conclusions are now drawn.

Coefficients of variation equal. Suppose $\alpha_1 = \ldots = \alpha_n = \alpha$ and $x_{n+1}$ has coefficient of
variation $100\beta$. Then the lognormal fit to the new product will be as good as the fit to the old
if $\beta$ is less than the value given in Table 4, $n = 1\,(1)\,6$; the asymptotic criterion is also shown.


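Table 4. Critical values of $\beta$ for $\alpha_1 = \ldots = \alpha_n = \alpha$

  n  | Criterion $\kappa_3$ | Criterion $\kappa_4$

  1  | $1.46\alpha$ | $1.41\alpha$
  2  | $1.38\alpha$ | $1.30\alpha$
  3  | $1.32\alpha$ | $1.26\alpha$
  4  | $1.29\alpha$ | $1.24\alpha$
  5  | $1.26\alpha$ | $1.24\alpha$
  6  | $1.23\alpha$ | $1.23\alpha$
  ∞  | $1.22\alpha$ | $1.19\alpha$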


Table 5. Critical values of $\beta$: $\alpha_i$ unequal

  n  | $\alpha_2$   | $\alpha_3$   | Criterion $\kappa_3$ | Criterion $\kappa_4$

  2  | $\alpha/2$ |            | $1.28\alpha$ | $1.29\alpha$
  2  | $\alpha/5$ |            | $1.40\alpha$ | $1.39\alpha$
  3  | $\alpha/2$ | $\alpha/2$ | $1.27\alpha$ | $1.25\alpha$
  3  | $\alpha/5$ | $\alpha/5$ | $1.21\alpha$ | $1.36\alpha$

Coefficients of variation unequal. For various values of $\alpha_2$ and $\alpha_3$, Table 5 gives the critical
value of $\beta$ in terms of $\alpha_1 = \alpha$.
It is clear from Tables 4 and 5 that when a new component is added to $q$ and it is known
that the lognormal approximation to $q$ is satisfactory, the coefficient of variation of the
new component can be of the order of 1.25 times the largest coefficient of variation already
present. When this is the case, we have some confidence that the new $q$ will also have a
satisfactory lognormal approximation. In this way we may extend the results of Table 3.
We may, for example, conclude that it is likely that the lognormal approximations are
satisfactory for $RN/N$, $\alpha_1 < 0.1$, $\alpha_2 < 0.06$ and $\alpha_3 < 0.05$, and for $RN/NN$, $\alpha_1 < 0.1$, $\alpha_2 < 0.06$,
and $\alpha_3$, $\alpha_4 < 0.05$.


5. THE DISTANCE BETWEEN DISTRIBUTIONS 


5.1. We have already pointed out that there are several methods of choosing a lognormal
approximation to $q$, and we have given examples to show that the fit by moments to $q$ gives
good practical results. It is possible to give this method some theoretical justification,
which of course rests on the criterion adopted for judging an approximation.


Suppose that we are given the cumulative distribution function $F(x)$, for which the
first and second moments $\mu$ and $\sigma^2$ exist. Suppose also the cumulative distribution $G(x)$
has moments $m$ and $s^2$, and that we are to choose from some family that $G(x)$ which most
resembles $F(x)$. For example, $G(x)$ may be a well-tabulated distribution, and we are at
liberty to choose the parameters $m$ and $s^2$. The manner in which $G(x)$ is to resemble $F(x)$ is
of course critical, and the number of possible criteria is unbounded. We develop first two
almost trivial criteria and then discuss two approaches to the distance between $F(x)$
and $G(x)$.

5.2. In the first place, suppose we really require two percentage points of $F(x)$; this is the
type of problem we stated initially. Suppose two parameters of $G(x)$ are available, and for
particular values of these parameters the two percentage points of $G(x)$ coincide with those
of $F(x)$. These are the values we should choose. For example, in fitting a normal distribution
to the 5 and 95% points of $R/N$, $\alpha_1 = 0.1$, $\alpha_2 = 0.05$, we have to solve

$$m - 1.645s = 0.882, \qquad m + 1.645s = 1.133,$$

to obtain a unique solution. This method cannot be usefully applied, since if the points were
known we would not require to fit $G(x)$.

Now suppose it is known that a good fit will be obtained by selecting a set of known points
for agreement. For example, the author (1954) gives the 1 and 5% points of $R/N$; suppose
the 2½% point is required. The lognormal distribution is a reasonable fit to this distribution;
we therefore fit a normal distribution to the logarithms of the 1 and 5% points $p_1$ and $p_5$.
We solve

$$m - 2.326s = \log p_1, \qquad m - 1.645s = \log p_5.$$

The 2½% point, $p_{2.5}$, is then given by

$$m - 1.960s = 0.462\log p_1 + 0.538\log p_5 = \log p_{2.5}.$$

Calculations show that this interpolation is within the accuracy of the tables given. The
method is also exact when applied to interpolation or extrapolation in Table 1 of this paper.
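The interpolation just described amounts to a weighted mean of the logarithms of the two known points, the weights depending only on normal deviates. A minimal sketch, with hypothetical function names, is:

```python
from math import exp, log
from statistics import NormalDist


def interpolation_weights(t_lo, t_hi, t_new):
    """Weights on log p_lo and log p_hi that give log p_new on a fitted lognormal."""
    nd = NormalDist()
    z_lo, z_hi, z_new = (nd.inv_cdf(t / 100.0) for t in (t_lo, t_hi, t_new))
    w_lo = (z_hi - z_new) / (z_hi - z_lo)
    return w_lo, 1.0 - w_lo


def interpolate_point(t_lo, p_lo, t_hi, p_hi, t_new):
    """Percentage point at t_new from the known points p_lo (t_lo%) and p_hi (t_hi%)."""
    w_lo, w_hi = interpolation_weights(t_lo, t_hi, t_new)
    return exp(w_lo * log(p_lo) + w_hi * log(p_hi))


if __name__ == "__main__":
    print(interpolation_weights(1, 5, 2.5))
```

For the 1, 5 and 2½% points the weights come out close to 0.462 and 0.538. The same function extrapolates when `t_new` lies outside the two known percentages, one weight then being negative.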


5.3. The distance between two cumulative distribution functions at the point $x$ is
naturally defined as some monotonic increasing function, $M(z)$, of $|F(x) - G(x)|$. The
distance between the two functions may then be defined by either

$$\int_{-\infty}^{\infty} M(|F(x) - G(x)|)\,\psi(G(x))\,dG(x),$$

or

$$\sup_x\,[M(|F(x) - G(x)|)\,\psi(G(x))],$$

where $\psi(z)$ is some weighting function. The first definition is associated with Cramér (1928),
von Mises (1931) and Smirnov (1936); the second with Kolmogorov (1933).

We are concerned with distances along the $x$-axis rather than perpendicular to it; that
is, we aim to fit approximate percentage points which are close to the true points, rather
than to fix points at which $F(x)$ is close to the desired value.

Suppose that $F(x) = p$ and $G(x) = p$ have respectively the unique inverses $x = \phi(p)$ and
$x = \chi(p)$ for almost all $p$. We are concerned with a distance between the functions of the
form $M(|\phi(p) - \chi(p)|)$, and we require definitions of distance between the distributions of
the form

$$\int_0^1 M(|\phi(p) - \chi(p)|)\,\psi(p)\,dp,$$

or

$$\sup_{0 < p < 1}\,[M(|\phi(p) - \chi(p)|)\,\psi(p)].$$


We do not here consider the second definition, which gives an unbounded distance for
certain very simple cases, e.g. if $F(x)$ and $G(x)$ are normal distributions differing only in
variance, $M(z) = z$ and $\psi(p) = 1$.

A convenient form of the first definition for our purpose is given by $M(z) = z^2$, and $\psi(p)$ the
characteristic function of some set $E$ contained in the interval (0, 1) in which we are specially
interested. For example, $E$ might be the interval (0, 1) itself, or the union of the intervals
(0.005, 0.05) and (0.95, 0.995).

We therefore define the distance $\theta$ between $F(x)$ and $G(x)$ by

$$\theta = \int_E \{\phi(p) - \chi(p)\}^2\,dp\,\Big/\,|E|.$$

The standardized form of $x$ corresponding to $F(x)$ is $(x - \mu)/\sigma$; let $\xi(p)$ be the inverse of the
distribution function corresponding to the standardized form and $\eta(p)$ the similar function
derived from $G(x)$:

$$\phi(p) = \mu + \sigma\xi(p), \qquad \chi(p) = m + s\eta(p). \qquad (7)$$

It follows that

$$|E|\,\theta = (\mu - m)^2 + 2\sigma(\mu - m)\int_E \xi(p)\,dp - 2s(\mu - m)\int_E \eta(p)\,dp + \int_E \{\sigma\xi(p) - s\eta(p)\}^2\,dp. \qquad (8)$$


Consider first $|E| = 1$, that is, $E$ consists of all points in (0, 1) except (possibly) for
a set of measure zero; for instance, we would exclude values of $p$ for which
$\xi(p)$ and $\eta(p)$ are not unique. Here

$$\theta = (\mu - m)^2 + \sigma^2 - 2\sigma s I + s^2,$$

where

$$I = \int_0^1 \xi(p)\,\eta(p)\,dp.$$

If $G(x)$ differs from $F(x)$ in mean and variance at most, $I = 1$, and

$$\theta = (\mu - m)^2 + (\sigma - s)^2;$$

this is reduced to zero by choosing $m = \mu$ and $s = \sigma$.

In all other cases $I < 1$. To minimize $\theta$ we should choose $m = \mu$ and $s = I\sigma$, when $\theta$ takes
the value $\sigma^2(1 - I^2)$. If more than the mean and variance can be chosen we should make
our choice so as to maximize $I$.

By this criterion the method of fitting by moments is correct in that it sets $m = \mu$,
but it is not optimum in taking $s = \sigma$; the method gives $\theta = 2\sigma^2(1 - I)$. The optimum
procedure is to calculate $I$ and then to choose $s$ accordingly. However, the method of fitting
by moments will not be far from optimum if $I$ is near one, i.e. if the standardized forms of
$F(x)$ and $G(x)$ are not too different. Suppose $I = 1 - \epsilon$, where $\epsilon$ is small; then the minimum
value of $\theta$ is $\sigma^2(2\epsilon - \epsilon^2)$, and for the method of fitting by moments $\theta$ is only slightly
greater: $\theta = 2\sigma^2\epsilon$.
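To show how $I$ and the two values of $\theta$ might be computed, the sketch below takes one deliberately unfavourable case as an assumption of this illustration: $F$ exponential and $G$ normal. It changes the variable to $p = \Phi(z)$, so that $\eta(p) = z$, and integrates by Simpson's rule; all names are hypothetical.

```python
from math import erfc, exp, log, pi, sqrt


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def moment_integral_exponential(cut=10.0):
    """I = integral of xi(p)*eta(p) dp for F exponential and G normal."""
    def integrand(z):
        upper_tail = 0.5 * erfc(z / sqrt(2.0))   # 1 - Phi(z), accurate in the tail
        xi = -log(upper_tail) - 1.0              # standardized exponential quantile
        return xi * z * exp(-0.5 * z * z) / sqrt(2.0 * pi)

    return simpson(integrand, -cut, cut)


def theta_moments(sigma, big_i):
    """Distance when m = mu and s = sigma (fitting by moments)."""
    return 2.0 * sigma ** 2 * (1.0 - big_i)


def theta_optimum(sigma, big_i):
    """Distance when m = mu and s = I * sigma (the minimizing choice)."""
    return sigma ** 2 * (1.0 - big_i ** 2)


if __name__ == "__main__":
    big_i = moment_integral_exponential()
    print(big_i, theta_moments(1.0, big_i), theta_optimum(1.0, big_i))
```

Even for so skew a distribution $I$ is about 0.9, and the two distances differ by $\sigma^2(1 - I)^2$, which is small whenever $I$ is near one.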
hich we require approxi- 


z for w. 
be cà culated in a ea ere the y?-distribution has 30 d.£., 
4 ) = 0:05, we 


yep sation ton (2X } ub 
We ah 8. Butin fitting an ential distribution voibution even to R|R, % 0-1, 2; 
9h, . “An To. : normal dis 

ta: 0-995, In fitting 2 


48 large as 0-92. 


The more general case, $|E| < 1$, will not be discussed here except to point out that the
procedure of minimizing $\theta$ is very similar. If we write (8) as

$$|E|\,\theta = (\mu - m)^2 + 2(\mu - m)(U\sigma - Vs) + \sigma^2 X - 2\sigma s Y + s^2 Z,$$

we see we must choose

$$m = \mu + (VY - UZ)\sigma/(V^2 - Z), \qquad s = (UV - Y)\sigma/(V^2 - Z), \qquad (9)$$

provided $G(x)$ is non-singular.

If, for example, $F(x)$, $G(x)$ and $E$ are symmetrical about their means, (9) reduces to

$$m = \mu \qquad\text{and}\qquad s = Y\sigma/Z.$$


5.4. Although the preceding section may be regarded as a refinement and in part a
justification of the method of fitting by moments, it cannot be applied as it stands to the
lognormal approximation to $q$. For the lognormal, and for a wide class of distributions, the
standard form $\eta(p)$ of §5.3 does not exist. It exists only if the family $G(x)$ is linear. The
product and quotient of standard forms are not a linear family, nor can they be transformed
into a linear family. This is evident from the fact that such a product, say $(a + bx_1)(c + dx_2)$,
or quotient has three parameters ($ac$, $b/a$ and $d/c$), while a linear family has only two.

Although we cannot fit a best lognormal to a general product or quotient of variates, the
argument above still shows that the method of fitting by moments to $q$ is nearly optimum.
The definitions of $\xi(p)$ and $\eta(p)$ must now follow from (7).

The method of fitting to $\log q$ is nearly optimum in another metric, in which the distance
between $F(x)$ and $G(x)$ is

$$\int_0^1 \{\log\phi(p) - \log\chi(p)\}^2\,\psi(p)\,dp.$$


6. CONCLUSIONS 


We return to the problem of setting limits to

$$q = (x_1 \ldots x_s)/(x_{s+1} \ldots x_n).$$

There is no single answer to the question, what approximation should we use to calculate
good limits? We must use judgement in selecting the known distribution used in the calculation:
normal, lognormal, or one of those described in §2. When the family is lognormal, the
method of fitting by moments to $q$ is recommended. It is recommended only because the
metric in which it is nearly optimum is the more natural; in practice there is little difference
between the two approximations. Given Tables 1 and 2, the fit to $q$ is easily made;
without Table 1, the fit to $\log q$ is simpler.

If the number of component variates is larger than two, the lognormal approximation
will give satisfactory results when the coefficients of variation of the components are small
and not too different. It is therefore a suitable approximation for general use. If the
coefficient of variation of $q$ is sufficiently small, the normal approximation gives very similar
results.

To calculate the first two moments of $q$ it is necessary to know the coefficient of variation,
or quotient of half-range and mean, of each component, and then to combine the values given
in Table 2. The percentage points are then given in Table 1, interpolation and extrapolation
for other percentage points being discussed in §5.2. Finally, these points are to be multiplied
by

$$(\mu_1\mu_2 \ldots \mu_s)/(\mu_{s+1} \ldots \mu_n).$$


The example we gave in §1 may now be completed. Suppose, for example, $E = NRN/NN$
(i.e. $W$ is measured with rectangular error, the other components with normal error) and
$\alpha_1 = 0.02$, $\alpha_2 = 0.01$, $\alpha_3 = 0.005$, $\alpha_4 = 0.015$, $\alpha_5 = 0.005$ and $\alpha_6 = 0.005$. We find the mean
and coefficient of variation of $E$ are 1.000275 and 2.57%.

The 1 and 99% limits to $E$ are 0.941 and 1.061 times its calculated value, i.e. the efficiency
has been determined plus or minus about 6%. The normal approximation gives very similar
limits: 0.937 and 1.063.

If, however, $\alpha_2$ were large in relation to $\alpha_1$, $\alpha_3$, etc., we might prefer to use $R/N$ rather than
the lognormal as an approximation, and to obtain percentage points from Broadbent (1954).


The author thanks Prof. G. A. Barnard for his encouragement and assistance, Prof.
E. S. Pearson for his help in preparing the work for publication, and the Director General
of the British Coal Utilization Research Association for permission to publish it.


REFERENCES

Aroian, L. A. (1947). The probability function of the product of two normally distributed variables.
  Ann. Math. Statist. 18, 265.
Broadbent, S. R. (1954). The quotient of a rectangular or triangular and a general variate.
  Biometrika, 41, 330.
Brunt, D. (1931). The Combination of Observations, 2nd ed. Cambridge University Press.
Craig, C. C. (1936). On the frequency function of xy. Ann. Math. Statist. 7, 1.
Cramér, H. (1928). On the composition of elementary errors. Skand. AktuarTidskr. 11, 13.
Cramér, H. (1951). Mathematical Methods of Statistics. Princeton University Press.
Creasy, M. A. (1954). Limits for the ratio of means. J. R. Statist. Soc. B, 16, 186.
Daniels, H. E. (1954). Saddlepoint approximations in statistics. Ann. Math. Statist. 25, 631.
Derksen, J. B. D. (1939). On some infinite series introduced by Tschuprow. Ann. Math. Statist.
  10, 380.
Fieller, E. C. (1932). The distribution of the index in a normal bivariate population. Biometrika,
  24, 428.
Finney, D. J. (1941). On the distribution of a variate whose logarithm is normally distributed.
  J. R. Statist. Soc. Suppl. 7, 189.
Gaddum, J. H. (1945). Log-normal distributions. Nature, Lond., 156, 463.
Geary, R. C. (1930). The frequency distribution of the quotient of two normal variates. J. R. Statist.
  Soc. A, 93, 442.
Haldane, J. B. S. (1942). Moments of the distributions of powers and products of normal variates.
  Biometrika, 32, 226.
Johnson, N. L. (1949). Systems of frequency curves generated by methods of translation. Biometrika,
  36, 149.
Kolmogorov, A. (1933). Sulla determinazione empirica di una legge di distribuzione. G. Ist. ital.
  Attuari, 4, 83.
Pearson, K. (1922). Tables of the Incomplete Gamma Function. Cambridge University Press.
Shellard, G. D. (1952). Estimating the product of several random variables. J. Amer. Statist.
  Ass. 47, 216.
Smirnov, N. (1936). Sur la distribution de ω². C.R. Acad. Sci., Paris, 202, 449.
von Mises, R. (1931). Wahrscheinlichkeitsrechnung. Vienna: Deuticke.
Wicksell, S. D. (1917). On the genetic theory of frequency. Ark. Mat. Astr. Fys. 12, no. 20.
Wicksell, S. D. (1921). An exact formula for spurious correlation. Metron, 1, 33.




A REJECTION CRITERION BASED UPON THE RANGE†
By C. I. BLISS, W. G. COCHRAN and J. W. TUKEY‡


The experimenter is occasionally faced with an inexplicable, 'aberrant' observation in an
otherwise valid set of data. If it is defective and he accepts it, or if it is sound and he rejects
it, his results will be biased. By following an objective rule rather than a subjective
impression, he can control, and perhaps minimize, his risk of making a wrong decision. The
rule proposed here is intended for data consisting of equal-sized sets of replicate measurements,
the several sets possibly varying in their means but all being samples from populations
with the same variance. It was developed originally to meet the need for a
rejection criterion for use with several of the bioassays in the U.S. Pharmacopeia XV (1955).
Other applications could be cited.

For the test which we propose, the range is computed from each set of $n$ measurements,
there being $k$ sets in all. The largest range is divided by the sum of all the ranges. The
resulting ratio $T$ is compared with the tabular value for the appropriate $n$ and $k$ at the
probability level of $P = 0.05$. If the observed ratio exceeds the tabular value (cf. Tables 1
and 2), the set represented by this largest range is assumed to contain an aberrant observation
or outlier, which is identified by inspection and rejected. Thus the proposed test is
closely related to Cochran's test (1941) for the largest variance in a series. The application
of this rule controls the probability of biasing a result by failing to reject an outlier.
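The computation of the ratio $T$ can be sketched in a few lines. The function names are hypothetical, the critical values must still be taken from the published tables, and the ranges in the usage example are those of the corticotropin assay discussed in the text.

```python
def range_ratio_from_ranges(ranges):
    """Return (T, index of the largest range), where T = largest range / sum of ranges."""
    total = sum(ranges)
    if total <= 0:
        raise ValueError("at least one range must be positive")
    largest = max(range(len(ranges)), key=lambda i: ranges[i])
    return ranges[largest] / total, largest


def range_ratio(sets):
    """Same statistic computed from k equal-sized sets of replicate measurements."""
    return range_ratio_from_ranges([max(s) - min(s) for s in sets])


if __name__ == "__main__":
    assay_ranges = [0.86, 0.79, 0.37, 1.49, 0.80, 0.76]   # ranges of the six dosage groups
    print(range_ratio_from_ranges(assay_ranges))
```

The returned index identifies the suspect set; the aberrant observation within it is then found by inspection, as the text describes.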

The test may be illustrated numerically with data from two biological assays that d an 
submitted in collaborative studies sponsored by the U.S. Pharmacopeia. The first "s i; 
assay of corticotropin from the concentration of ascorbic acid in the adrenal een one 
hypophysectomized rats. Seven rats were assigned at random to each of six dosage a jon 
Each group, in turn, was injected with one of three dosages of the standard propa ine 
(8, to S,) and of the test or unknown preparation (U, to U;). The response y was dae 
Separately from the adrenal glands of each rat. The total for each treatment group i$ £ 
below, together with its range and sum of squares (6s?) : 


Dose                      S1      S2      S3      U1      U2      U3

Total response, Σy       28.92   24.87   20.62   26.95   25.17   20.22
Range (n = 7)             0.86    0.79    0.37    1.49    0.80    0.76
Variance × 6, 6s²         0.442   0.362   0.117   1.642   0.455   0.467

The range for $U_1$ (1.49) is the largest and most suspect; when it is divided by the sum of the ranges, we obtain T = 1.49/5.07 = 0.294. This exceeds the upper 5% point (0.288 in Table 1) for k = 6 and n = 7, indicating an outlier. The identification of this group by our proposed range criterion may be checked against Cochran's test; its critical


† Prepared, in part, in connexion with research sponsored by the Office of Naval Research.

‡ From The Connecticut Agricultural Experiment Station and Yale University, Johns Hopkins University, and Princeton University, respectively.


C. I. Bliss, W. G. Cochran and J. W. Tukey 419


ratio of 0.471 exceeds the upper 1% point (0.461) in the table by Eisenhart, Hastay & Wallis (1947). Among the individual y's in this group (2.68, 3.90, 4.00, 4.02, 4.06, 4.12 and 4.17), the smallest response not only falls considerably short of the others, but also contributes to a suspiciously small total response for its dose, when compared with the totals for other groups.
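
The arithmetic of this first example can be retraced with a short Python sketch. The ranges and sums of squares are those tabulated for the six dosage groups, and the critical value 0.288 is the Table 1 entry for k = 6, n = 7; the function name `largest_share` is purely illustrative and not part of the original paper.

```python
# Ranges and sums of squares (6 s^2) of the six treatment groups, n = 7 rats each.
ranges = {"S1": 0.86, "S2": 0.79, "S3": 0.37, "U1": 1.49, "U2": 0.80, "U3": 0.76}
sums_of_squares = {"S1": 0.442, "S2": 0.362, "S3": 0.117,
                   "U1": 1.642, "U2": 0.455, "U3": 0.467}

def largest_share(values):
    """Return (label, ratio): the largest value as a fraction of the total."""
    label = max(values, key=values.get)
    return label, values[label] / sum(values.values())

CRITICAL_T = 0.288  # upper 5% point of T for k = 6, n = 7 (Table 1)

suspect, T = largest_share(ranges)               # range criterion
_, cochran = largest_share(sums_of_squares)      # Cochran's ratio, for comparison

print(suspect, round(T, 3), T > CRITICAL_T)
print(round(cochran, 3))
```

The same helper serves for both criteria because each is the largest member of a set expressed as a fraction of the set's total; only the quantity summed (ranges or sums of squares) differs.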
The second example is from a turbidimetric assay of vitamin $B_{12}$. Four sets of triplicate tubes were prepared for each of six dosage levels of the reference standard. One set was placed in each of four tube racks, together with triplicate tubes for four dosage levels of a test preparation or unknown in two of the four racks. Within each rack the tubes were intermingled at random. After incubation overnight, the percentage transmittance of each tube was read in a photometer. The ranges of the k = 32 sets of n = 3 tubes had the following distribution:
Observed range of 3:    0    1    2    3    4   ...   13
No. of sets:            1    5    6    8    4   ...    1
From the total of the ranges (= 113), the observed ratio T = 13/113 = 0.115. Since k > 10, we enter Table 2 with (k + 2)T = 34 × 0.115 = 3.91, which much exceeds the value of 3.08, interpolated between 3.03 and 3.11 in the column for n = 3. On the basis of all 32 ranges, we attribute the range of 13 to an outlier. The three readings comprising this set show transmittances of 24, 34 and 37%, from which the reading of 24% would be rejected as aberrant. The ratio may now be recomputed with the largest of the 31 remaining ranges in the numerator to obtain T = 8/100 = 0.08. Since 33 × 0.08 = 2.64 is less than the interpolated value (3.07) for n = 3 and k = 31, no other observation would be rejected from the series.
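
For more than ten ranges the comparison is made on the scale (k + 2)T. The two passes of the vitamin assay can be sketched as follows; the critical values 3.08 and 3.07 are the interpolated figures quoted in the text, and `scaled_ratio` is an illustrative name only.

```python
def scaled_ratio(largest_range, total_range, k):
    """(k + 2) * T, the quantity tabulated in Table 2 for k > 10."""
    return (k + 2) * largest_range / total_range

# First pass: all 32 ranges, largest range 13, total 113.
first = scaled_ratio(13, 113, 32)
# Second pass: the set with range 13 removed, largest remaining range 8, total 100.
second = scaled_ratio(8, 100, 31)

print(round(first, 2), first > 3.08)     # first pass: outlier indicated
print(round(second, 2), second > 3.07)   # second pass: no further rejection
```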


DISTRIBUTION

The critical values of the ratio T in Table 1 have been computed for k = 2 to 10 ranges, each determined from a set of n = 2 to 10 measurements. Table 2 gives critical values of (k + 2)T for 10 to 50 ranges. Together they should cover the situations that occur most frequently.

The original variates are assumed to be distributed normally with a common standard deviation within groups. Since our criterion is homogeneous, we may take $\sigma = 1$ without loss of generality.

Let $w_1, w_2, \ldots, w_k$ be a set of k ranges from normal samples, each of size n and independently and identically distributed. Let $w_s$ be the largest w and W the sum of the w's. Then

$$T_k = \frac{w_s}{W} = \frac{w_s/(W - w_s)}{1 + w_s/(W - w_s)} = \frac{S_{k-1}}{1 + S_{k-1}},$$

where

$$S_{k-1} = \frac{w_s}{W - w_s} = \max_i \frac{w_i}{w_1 + \cdots + w_{i-1} + w_{i+1} + \cdots + w_k}.$$

Thus the distribution of $T_k$ is determined by that of $S_{k-1}$, which is the largest of k identically distributed quantities of the form

$$R_{k-1} = \frac{w_1}{w_2 + w_3 + \cdots + w_k}.$$

420 Rejection criterion based upon the range


The argument used in developing Cochran's test (1941) for the largest variance can be applied to relate the percentage points of $S_{k-1}$ to those of $R_{k-1}$. The value of $S_{k-1}$ for P = 0.05 falls between the values of $R_{k-1}$ for

$$P_1 = 0.05/k \quad \text{and} \quad P_2 = 1 - (0.95)^{1/k}.$$

Roughly, $P_1$ is $0.98P_2$ for all k; for k = 2, the two values are $P_1 = 0.025$ and $P_2 = 0.02532$, and for k = 20, they are $P_1 = 0.0025$ and $P_2 = 0.002558$. Since investigation by one of us suggested that the desired value of P was closer to $P_1$, we have used the compromise probability $P^* = \tfrac{1}{3}(2P_1 + P_2)$, except where $T_k > \tfrac{1}{2}$ (that is, $S_{k-1} > 1$) when $P_1$ is exact.
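
The two bounding probabilities and the compromise between them are easy to compute. The sketch below assumes that the compromise weights $P_1$ twice as heavily as $P_2$ (one-third of $2P_1 + P_2$), which is a reading of the printed formula consistent with the remark that the desired value lies closer to $P_1$; the function name is hypothetical.

```python
def bounding_probabilities(k, p=0.05):
    """Return (P1, P2, P*) for the largest of k ratios at overall level p."""
    p1 = p / k                              # Bonferroni-type lower bound
    p2 = 1.0 - (1.0 - p) ** (1.0 / k)       # value if the k ratios were independent
    p_star = (2.0 * p1 + p2) / 3.0          # assumed compromise weights (2:1)
    return p1, p2, p_star

p1, p2, p_star = bounding_probabilities(2)
print(p1, round(p2, 5), round(p_star, 5))
```

When $T_k > \tfrac{1}{2}$ only one range can exceed half the total, so the k events are mutually exclusive and $P_1$ needs no compromise.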

The calculation of critical levels of $T_k$ for P = 0.05 is thus reduced to the calculation of critical values of $R_{k-1}$ for $P = P^*$, from which

$$T_{k,\,0.05} = \frac{R_{k-1,\,P^*}}{1 + R_{k-1,\,P^*}}.$$

Since the numerator and denominator of $R_{k-1}$ are independent, several approximations are available.
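
As an independent check on the definition of $T_k$, an upper 5% point can be estimated by direct simulation. This is not one of the approximation methods used by the authors; it is a minimal Monte Carlo sketch using only the Python standard library, and the number of replications and the seed are arbitrary choices.

```python
import random

def simulate_T(k, n, rng):
    """One draw of T: largest of k normal ranges (n observations each) over their sum."""
    ranges = []
    for _ in range(k):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ranges.append(max(sample) - min(sample))
    return max(ranges) / sum(ranges)

def upper_point(k, n, p=0.05, reps=20000, seed=1):
    """Empirical upper 100p% point of T from `reps` simulated values."""
    rng = random.Random(seed)
    draws = sorted(simulate_T(k, n, rng) for _ in range(reps))
    return draws[min(reps - 1, int((1.0 - p) * reps))]

estimate = upper_point(6, 7)
print(round(estimate, 3))   # to be compared with the Table 1 entry for k = 6, n = 7
```

With 20,000 replications the estimate should agree with the tabulated value to about two decimal places; a finer comparison would need many more replications.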


Table 1. Upper 5% points of T, the ratio of the largest of k independent normal ranges,
each of n observations, to the total of these ranges

No. of                    No. of observations n in each range
ranges k      2       3       4       5       6       7       8       9      10

   2        0.962   0.862   0.803   0.764   0.736   0.717   0.702   0.691   0.682
   3        0.813   0.667   0.601   0.563   0.539   0.521   0.507   0.498   ...
   4        0.681   0.538   0.479   0.446   0.425   0.410   0.398   0.389   0.382
   5        0.581   0.451   0.398   0.369   0.351   0.338   0.328   0.320   0.314

   6        0.508   0.389   0.343   0.316   0.300   0.288   0.280   0.273   0.267
   7        0.451   0.342   0.300   0.278   0.263   0.253   0.245   0.239   ...
   8        0.407   0.305   0.267   0.248   0.234   0.225   0.218   0.213   0.208
   9        0.369   0.276   0.241   0.224   0.211   0.203   0.197   0.192   0.188
  10        0.339   0.253   0.220   0.204   0.193   0.185   0.179   0.174   0.172


Table 2. Upper 5% points of (k + 2)T, where T is the ratio of the largest of k independent
normal ranges, each of n observations, to the total of these ranges

No. of                    No. of observations n in each range
ranges k      2       3       4       5       6       7       8       9      10

  10        4.06    3.04    2.65    2.445   2.305   2.21    2.145   2.09    2.05
  12        4.06    3.03    2.635   2.425   2.29    2.20    2.13    2.075   2.04
  15        4.06    3.02    2.625   2.415   2.28    2.185   2.12    2.065   2.025
  20        4.13    3.03    2.62    2.415   2.28    2.185   2.11    2.05    2.01
  50        4.26    3.11    2.67    2.44    2.29    2.19    2.11    2.06    2.01




APPROXIMATE P* VALUES OF $R_{k-1}$

The critical values of $R_{k-1}$ were approximated by two methods discussed below. Most of the values in Table 1 were obtained by method A, and most of those in Table 2 by method B. The two agree moderately well, and intermediate tabular values give some weight to both methods.
A range or total range is often approximated by a multiple of the square root of a $\chi^2$-variate, i.e. by a $\chi$-variate. Equivalent degrees of freedom $\nu$ and scale factors c for a single range are given by Thomson (1953), and for mean ranges by David (1951). From these, the approximate distributions of $w_1$ and of $\bar{w}$ are

$$w_1 \approx \frac{c_1 \chi_1}{\sqrt{\nu_1}} \quad \text{and} \quad \bar{w} = \frac{w_2 + w_3 + \cdots + w_k}{k - 1} \approx \frac{c_2 \chi_2}{\sqrt{\nu_2}}.$$

When required, more accurate scale factors for mean or total ranges can be found from

$$c^2 = d_n^2 + V_n/k \quad \text{(David, 1951)}.$$

Method A consists of the straightforward approximation

$$R_{k-1} = \frac{w_1}{w_2 + w_3 + \cdots + w_k} \approx \frac{c_1 \chi_1/\sqrt{\nu_1}}{(k-1)\, c_2 \chi_2/\sqrt{\nu_2}} = \frac{c_1 \sqrt{F}}{(k-1)\, c_2},$$

where $F = (\chi_1^2/\nu_1)/(\chi_2^2/\nu_2)$ has $\nu_1$ and $\nu_2$ degrees of freedom. The required values of $\sqrt{F}$ for critical levels of $P^*$ are obtained by interpolation in a table of F.
Method B is less direct and more refined, the $\chi$ approximation appearing only in the denominator of $R_{k-1}$. Let

$$w' = \frac{w_2 + w_3 + \cdots + w_k}{(k-1)\, c_2}.$$

The expected mean square of $w'$ is $\sigma^2$, and we can regard

$$R_{k-1} = \frac{w_1}{w_2 + w_3 + \cdots + w_k} = \frac{w_1/\{(k-1)\, c_2\}}{w'}$$

as the result of studentizing $w_1/\{(k-1)c_2\}$ with the scale estimator $w'$. We know the result of studentizing a $\chi$-variate (at probability $P^*$) with the scale estimator based on $\nu_2$ degrees of freedom. In that case $(F_{\nu_1,\infty})^{1/2}$ is converted into $(F_{\nu_1,\nu_2})^{1/2}$ and the $P^*$ value is increased in the ratio $\sqrt{(F/F_\infty)}$, where F is the $P^*$ value of $F_{\nu_1,\nu_2}$ and $F_\infty$ is the $P^*$ value of $F_{\nu_1,\infty}$. If we take the same studentizing factor as applicable to the studentization of $w_1/\{(k-1)c_2\}$ at $P^*$ by $w'$, we have, as the method B approximation,

$$R_{k-1} \approx \frac{w}{(k-1)\, c_2} \sqrt{\frac{F}{F_\infty}},$$

where w is now the range of n unit normally distributed items at $P^*$ (Pearson, 1942). To improve the accuracy of linear interpolation for non-tabular values of $\nu_1$, $\nu_2$ and $P^*$, selected values of $\sqrt{(F/F_\infty)}$ were first computed from the five-figure tables of Merrington & Thompson (1943), supplemented by the Hald tables (1952) for larger values of $\nu_2$ and for P = 0.001.

It is not claimed that the entries in the tables are correct to the last figure given, but we believe most of them will not differ from the true values by more than a few units in the last place.

The two approximations differ by the factor

$$\frac{w/c_1}{\sqrt{F_\infty}},$$

which would be unity if the $\chi$-variate approximation to the numerator range were perfect. Method B has the advantage of allowing for idiosyncrasies in the numerator range, rather than hiding them in a $\chi$-variate approximation. For k > 2, the total range in the denominator is approximated more closely by the $\chi$-variate than is the simple range in the numerator.†


THE CASE OF n = 2

For ranges of 2, $w_1 = \sqrt{2}\,|x|$, where x is a unit normal deviate. Methods A and B are now identical, since w is exactly a $\chi$-variate. The desired critical values are related directly to those of Lord's (1947) statistic

$$u(k-1, 2) = \frac{d_2 |x|}{\bar{w}} = \frac{(k-1)\, d_2 |x|}{w_2 + w_3 + \cdots + w_k},$$

where $\bar{w}$ is the mean of k - 1 ranges, and $d_2 = 1.1284$ is the average range of a sample of n = 2. The critical values of $R_{k-1}$ are thus $\sqrt{2}/\{d_2(k-1)\}$ times those of u(k - 1, 2). Despite rather substantial interpolation for $P^*$ at some values of k, we have used Lord's tables in computing the values for n = 2 in Tables 1 and 2.
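
The constants in this section can be checked numerically. The sketch below computes $d_2 = 2/\sqrt{\pi}$ and the conversion factor from Lord's statistic. It also evaluates the one case that has a simple closed form, k = 2 with n = 2: there $R_1 = |x_1|/|x_2|$ is the absolute value of a Cauchy variate, so that $P(R_1 > s) = 1 - (2/\pi)\arctan s$, and $P_1 = 0.05/2$ is exact because T exceeds one-half. That closed form is a check added here, not a calculation taken from the paper, and the function names are illustrative.

```python
import math

d2 = 2.0 / math.sqrt(math.pi)   # mean range of two unit normal observations

def lord_to_R_factor(k):
    """Multiplier converting a critical value of u(k-1, 2) into one of R_{k-1}."""
    return math.sqrt(2.0) / (d2 * (k - 1))

def exact_T_k2_n2(p=0.05):
    """Upper 100p% point of T for k = 2 ranges of n = 2 observations each."""
    p1 = p / 2.0                                   # exact here, since T > 1/2
    s = math.tan(0.5 * math.pi * (1.0 - p1))       # solves 1 - (2/pi) atan(s) = p1
    return s / (1.0 + s)

print(round(d2, 4))
print(round(exact_T_k2_n2(), 3))   # compare with the k = 2, n = 2 entry of Table 1
```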


REFERENCES

Cochran, W. G. (1941). The distribution of the largest of a set of estimated variances as a fraction of their total. Ann. Eugen., Lond., 11, 47.
David, H. A. (1951). Further applications of range to the analysis of variance. Biometrika, 38, 393.
Eisenhart, C., Hastay, M. W. & Wallis, W. A. (1947). Selected Techniques of Statistical Analysis. New York: McGraw-Hill Book Co.
Hald, A. (1952). Statistical Tables and Formulas. New York: John Wiley and Sons.
Lord, E. (1947). The use of range in place of standard deviation in the t-test. Biometrika, 34, 41.
Merrington, M. & Thompson, C. M. (1943). Tables of percentage points of the inverted beta (F) distribution. Biometrika, 33, 73.
Pearson, E. S. (1942). The probability distribution of the range in samples of n observations from a normal population. Biometrika, 32, 301.
Thomson, G. M. (1953). Scale factors and degrees of freedom for small sample sizes for $\chi$-approximation to the range. Biometrika, 40, 449.
Tukey, J. W. (1956). Every man his own studentizer. Memorandum Report 58. Statistical Research Group, Princeton University.
United States Pharmacopeia XV (1955). Easton, Pa.: Mack Publishing Co.


† Since the completion of the present calculations method B has been studied further by one of the authors (Tukey, 1956). This investigation indicates that, as used here, method B should give slight, but only slight, overestimates of the critical values.


[ 423 ] 




CONFIDENCE INTERVALS FOR A PROPORTION 


By EDWIN L. CROW*
U.S. Naval Ordnance Test Station, China Lake, California 


1. SUMMARY AND DEFINITIONS


Tables of confidence intervals for a proportion based on the sample proportion are presented, calculated by a slight modification of the method proposed by Sterne (1954), for fixed sample sizes up to 30 and confidence coefficients of 0.90, 0.95 and 0.99. This system is compared, especially in shortness, with Sterne's system, Clopper & Pearson's (1934), and another, intermediate system. It is assumed that a random sample of fixed size n is drawn from an infinite population containing a proportion $\pi$ of individuals with a given characteristic, that r individuals are observed to have the characteristic, and that it is desired to estimate $\pi$ by means of a two-sided confidence interval.

While relatively tedious to calculate, Sterne's system and its modification have in common the advantage that no system is possible which has shorter total length of the n + 1 confidence intervals for r = 0, 1, ..., n. However, they are at a disadvantage if one-sided confidence intervals and tests are also of interest.

Neyman's definition of a confidence interval $\delta(r)$ (or system of confidence intervals), modified for the case of a single, discrete variable r and one parameter $\pi$, consists of the requirement that, whatever be $\pi$, the random interval $\delta(r)$ cover the true value of $\pi$ with a probability at least equal to a prescribed number called the confidence coefficient, say $1 - \epsilon$. Neyman showed that construction of a system $\delta(r)$ is equivalent to the determination, for each $\pi$, of regions of acceptance $A(\pi)$ such that:

(i) $P(r \in A(\pi) \mid \pi) \ge 1 - \epsilon$.
(ii) Every r is included in at least one $A(\pi)$.
(iii) The set of values of $\pi$ whose regions $A(\pi)$ contain r is a closed interval.

This interval is the confidence interval for $\pi$ to be used when the value r is observed.

In the present case of binomial sampling r takes only the integral values 0, 1, 2, ..., n, and the $A(\pi)$ will be taken as successive integers $r_1, r_1 + 1, \ldots, r_2$, say, such that

$$\sum_{r=r_1}^{r_2} p_{n,r}(\pi) \ge 1 - \epsilon, \qquad p_{n,r}(\pi) = \frac{n!}{r!\,(n-r)!}\, \pi^r (1 - \pi)^{n-r}. \qquad (1)$$

The end-points $r_1$ and $r_2$ are not uniquely determined by (1). Four different further restrictions which determine them uniquely are considered:
(1) The Clopper-Pearson system, $\delta_1(r)$ say, is determined by choosing 'central' acceptance regions $A_1(\pi)$, so that $r_1$ is the largest r and $r_2$ the smallest r with each tail probability not more than $\tfrac{1}{2}\epsilon$.

(2) The system $\delta_2(r)$ is determined by choosing acceptance regions $A_2(\pi)$ such that

$$r_1 = 0, \quad \sum_{r=r_2+1}^{n} p_{n,r}(\pi) \le \epsilon \quad (0 \le \pi \le \pi_{U0}), \qquad (2)$$

$$A_2(\pi) = A_1(\pi) \quad (\pi_{U0} < \pi \le \tfrac{1}{2}),$$

where $\pi_{U0}$ is the Clopper-Pearson upper confidence limit for r = 0, determined from $(1 - \pi_{U0})^n = \tfrac{1}{2}\epsilon$, and by choosing $A_2(\pi)$ for $\pi > \tfrac{1}{2}$ as the symmetrical regions consisting of the integers $n - r'$, where $r'$ is in $A_2(\pi')$, $\pi' = 1 - \pi$.

(3) The Sterne system $\delta_3(r)$ is determined by choosing acceptance regions $A_3(\pi)$ as those values of r with the largest probabilities of occurring, i.e. the r's are chosen in order, starting with the most probable and continuing in both directions from it until (1) is satisfied. If two values of r have equal probabilities and both cannot be excluded from the acceptance region, then both are included. (The latter provision yields a larger acceptance region than necessary at a finite number of values of $\pi$, but it makes the confidence intervals closed without increasing their total length.)

* Now at Boulder Laboratories of the National Bureau of Standards, Boulder, Colorado.
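
The construction of a Sterne acceptance region can be sketched directly from the description in (3). The Python below is a minimal illustration, not the desk procedure used for the tables: it ranks the binomial probabilities, accumulates them until (1) holds, and keeps any value tied with the last one admitted. The names `sterne_region` and `sterne_confidence_set`, the tie tolerance, and the grid of 1001 values of $\pi$ are all assumptions made for the example.

```python
from math import comb

def pmf(n, r, pi):
    """Binomial probability p_{n,r}(pi)."""
    return comb(n, r) * pi**r * (1.0 - pi)**(n - r)

def sterne_region(n, pi, eps, tol=1e-12):
    """Values of r, most probable first, until their total probability is >= 1 - eps."""
    ranked = sorted(((pmf(n, r, pi), r) for r in range(n + 1)), reverse=True)
    region, total, last_p = [], 0.0, None
    for p, r in ranked:
        # Stop once coverage is reached, unless r ties with the last value admitted.
        if total >= 1.0 - eps and last_p - p > tol:
            break
        region.append(r)
        total += p
        last_p = p
    return sorted(region)

def sterne_confidence_set(n, r, eps, steps=1000):
    """Grid values of pi whose acceptance region contains r (need not be an interval)."""
    grid = [i / steps for i in range(steps + 1)]
    return [pi for pi in grid if r in sterne_region(n, pi, eps)]

print(sterne_region(9, 0.27, 0.10))                 # r = 0 still accepted
print(sterne_region(9, 0.28, 0.10))                 # r = 5 has replaced r = 0
print(max(sterne_confidence_set(9, 0, 0.10)))       # upper limit for r = 0, to grid accuracy
```

Because the set returned by `sterne_confidence_set` is found by scanning a grid, it locates limits only to the grid spacing and does not by itself enforce condition (iii).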


Fig. 1. Confidence intervals $\delta_1$, $\delta_3$, $\delta_4$ for n = 9, $\epsilon$ = 0.10. (Vertical axis: $\pi$ from 0 to 1, with the values 0.232 and 0.275 marked; horizontal axis: r. Separate symbols mark the $\delta_1$ limits, the $\delta_3$ limits, and the $\delta_4$ limits where they differ from $\delta_3$.)
(4) The modified Sterne system $\delta_4(r)$ is the same as $\delta_3(r)$ except for the limits determined by the operation called substitution in §2. We show in §3 that the Sterne system has minimum total length because $A_3(\pi)$ is minimum for each $\pi$, except for a set of zero measure. This minimum property still holds if the substitution of one value of r for another occurs at the smallest value of $\pi$ ($< \tfrac{1}{2}$) consistent with (i) rather than at the value of $\pi$ where the two r's are equally probable, subject to the restriction that no r's other than the two of the substitution become involved. For example, in the case illustrated in Fig. 1 (to be discussed later) substitution of r = 0 for r = 5 may occur as early as $\pi = 0.232$ rather than at the value $\pi = 0.275$, where $p_{9,0}(\pi) = p_{9,5}(\pi)$. Since a substitution at $\pi < \tfrac{1}{2}$ simultaneously determines an upper confidence limit for a small value of r and a lower limit for a value nearer $\tfrac{1}{2}n$, earlier substitution has the advantage of transferring a given subinterval to a larger r, where it is relatively less important; this is the sole advantage of $\delta_4$ over $\delta_3$. Thus in Fig. 1 the subinterval (0.232, 0.275) is transferred from r = 0 to r = 5, so that the 90% confidence interval $\delta_4$ for r = 0 is (0, 0.232) and that for r = 5 is (0.232, 0.790). However, the natural irregularity of $\delta_3$ is augmented sufficiently by the above definition of $\delta_4$ to cause the 90% interval to be longer than the corresponding 95% interval in three independent cases for $n \le 30$ (n = 7, r = 5; n = 10, r = 5; n = 10, r = 7), in one of which (n = 10, r = 5) the latter interval is covered. (No 95% interval for $n \le 30$ is longer than the 99% interval.) Such interesting and theoretically permissible, but practically undesirable, results are prevented in $\delta_4$ and Table 1 by the additional restriction that no substitution is made before the point at which the length of an interval would equal that of a corresponding interval with larger commonly used confidence coefficient.


Fig. 2. Probability that the Sterne intervals $\delta_3$ fail to cover the true $\pi$ (n = 9, $\epsilon$ = 0.10). (Vertical axis: probability; horizontal axis: $\pi$ from 0 to 0.5.)


The four systems are defined above by direct satisfaction of condition (i) on acceptance regions. It can be readily shown that conditions (ii) and (iii) are also satisfied by $\delta_1$ and $\delta_2$. However, condition (iii) is not always satisfied by $\delta_3$ or $\delta_4$ for a reason to be discussed near the end of §2. At least five such cases occur in $\delta_3$ and $\delta_4$ for $n \le 30$: $\epsilon = 0.10$, n = 20, r = 0; $\epsilon$ = ..., n = ..., r = 1; $\epsilon = 0.01$, n = 27, r = 2; $\epsilon = 0.10$, n = 28, r = 2; $\epsilon = 0.01$, n = 28, r = 0. In the first case, for example, the $\delta_3$ confidence set for r = 0 consists of the two intervals (0, 0.127) and (0.141, 0.147). In order to obtain confidence sets which are intervals in such cases also without increasing the size of the acceptance regions, the definition of $A_3(\pi)$ is altered by replacing its least probable value of r by the next less probable value if this can be done without violating condition (i) on $A_3(\pi)$. This has been possible in calculations of $\delta_3$ and $\delta_4$ for $n \le 30$; e.g. in the above case the $\delta_3$ and $\delta_4$ intervals for r = 0 are (0, 0.127) and (0, 0.126) respectively by virtue of a confidence interval for r = 6 of (0.141, 0.500) rather than (0.147, 0.500).

A similar definition of a confidence interval can be used for the parameter of other one-parameter discrete distributions. A table of $\delta_4$ for the parameter of the Poisson distribution is being constructed.




2. CALCULATION OF TABLES

Because of the asymptotic normality of the binomial distribution, the differences among the four systems defined above are of interest only for small sample sizes. For the purpose of comparison all four systems were calculated to at least three decimal places for $n \le 20$ (and $\delta_4$ for $n \le 30$), although several tables provide $\delta_1$ in this range as well as beyond (e.g. Hald, 1952). The values of $\pi$ for which the tail sums of the binomial distribution exactly equal a specified probability are fixed percentage points of the Incomplete Beta-Function given by Thompson (1941) (repeated by Pearson & Hartley, 1954) and Clark (1953). They can also be obtained, accurate to about three decimal places, up to n = 150 by linear interpolation in the tables prepared by the National Bureau of Standards (1949) and the U.S. Army Ordnance Corps (1952). By symmetry the acceptance regions need be determined only for $\pi \le \tfrac{1}{2}$. The confidence limits for $\delta_1$ and $\delta_2$ are given respectively by $100\epsilon/2$ and a combination of $100\epsilon/2$ and $100\epsilon$ percentage points of the Incomplete Beta-Function.

In contrast, the confidence limits for $\delta_3$ and $\delta_4$ correspond to no fixed percentage points of the Incomplete Beta-Function and must be calculated by frequent reference to the individual terms as well as the sums of terms; these are both tabulated in the National Bureau of Standards Tables for $\pi$ = 0.01 (0.01) 0.50. Beginning with $\pi = 0$, we determine the complements of the acceptance regions $A_3(\pi)$ (or $A_4(\pi)$), that is, the values of r with the smallest probabilities $p_{n,r}(\pi)$ such that $\sum_r p_{n,r}(\pi) \le \epsilon$. For $\pi$ sufficiently near zero all such r are consecutive integers ending in n. Hence we enter the table of sums and successively determine:

(1) $\pi_{L1}$ such that $\sum_{r=1}^{n} p_{n,r}(\pi) \le \epsilon$ in $\pi_{L0} = 0 \le \pi \le \pi_{L1}$,

(2) $\pi_{L2}$ such that $\sum_{r=2}^{n} p_{n,r}(\pi) \le \epsilon$ in $\pi_{L1} < \pi \le \pi_{L2}$, (3)

up to the largest such value, say $\pi_{L,k-1}$, less than the $\pi_{U0}$ at which $p_{n,0}(\pi)$ becomes sufficiently small to be included in the sums $\sum_r p_{n,r}(\pi)$. The operation thus performed of transferring one value of r from the complement of $A_3(\pi)$ to $A_3(\pi)$ (no other value of r being transferred) is called elimination. The point r = 0 enters the complement of $A_3(\pi)$ by addition at $\pi = \pi_{U0}$ such that

$$p_{n,0}(\pi) + \sum_{r=k}^{n} p_{n,r}(\pi) = \epsilon \qquad (4)$$

if any such $\pi$ exists, or if not, by substitution for r = k at the $\pi = \pi_{U0} = \pi_{Lk}$ such that $p_{n,0}(\pi) = p_{n,k}(\pi)$. The calculation for $\delta_4$ differs only in that the substitution of r = 0 for r = k occurs as soon after $\pi = \pi_{L,k-1}$ as

$$p_{n,0}(\pi) + \sum_{r=k+1}^{n} p_{n,r}(\pi) \qquad (5)$$

decreases to $\epsilon$.
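
The elimination and substitution points described above are roots of simple equations in $\pi$, and they can be located by bisection instead of by interpolation in printed tables. The sketch below does this for n = 9, $\epsilon$ = 0.10 with k = 5. The bisection helper and all names are illustrative assumptions; the index ranges follow the reading of (3) to (5) adopted above, in which r = k is the value replaced by r = 0.

```python
from math import comb

def pmf(n, r, pi):
    """Binomial probability p_{n,r}(pi)."""
    return comb(n, r) * pi**r * (1.0 - pi)**(n - r)

def upper_tail(n, j, pi):
    """Sum of p_{n,r}(pi) for r = j, ..., n."""
    return sum(pmf(n, r, pi) for r in range(j, n + 1))

def bisect(f, lo, hi, iters=60):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, eps, k = 9, 0.10, 5

# Eliminations: lower limits for r = 1, ..., k - 1, where the upper tail from r reaches eps.
eliminations = [bisect(lambda p, j=j: upper_tail(n, j, p) - eps, 0.0, 1.0)
                for j in range(1, k)]

# Sterne substitution: r = 0 and r = k equally probable.
equal = bisect(lambda p: pmf(n, 0, p) - pmf(n, k, p), eliminations[-1], 0.5)

# Modified substitution: expression (5) first decreases to eps.
early = bisect(lambda p: pmf(n, 0, p) + upper_tail(n, k + 1, p) - eps,
               eliminations[-1], equal)

print([round(x, 4) for x in eliminations])
print(round(equal, 4), round(early, 4))
```

The printed values can be compared with the eliminations at 0.012, 0.061, 0.129 and 0.210 and the substitutions at 0.275 and 0.232 quoted for this case.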


The calculations of $\delta_3$ and $\delta_4$ are illustrated in Figs. 1 and 2 for n = 9, $\epsilon$ = 0.10; Fig. 1 also shows the corresponding Clopper-Pearson intervals $\delta_1$. There are eliminations at values of 0.012, 0.061, 0.129, 0.210, 0.390 and 0.485 (lower confidence limits for the eliminated values of r), a substitution at 0.275 (where $p_{9,0} = p_{9,5}$) for $\delta_3$ or 0.232 for $\delta_4$ and an addition at 0.391. In Fig. 2 each curve is labelled with the values of r whose probability sum is plotted. As indicated by dashed extensions of the probability curves, the substitution could be made, with the same minimum length of acceptance region, as late as 0.301 with $p_{9,5}$ in the sum or as early as 0.232 with $p_{9,0}$ in the sum. As shown in Fig. 1 the substitution simultaneously determines $\pi_{L5}$ as the lower confidence limit for r = 5 and its equal, $\pi_{U0}$, as the upper confidence limit for r = 0.
The calculation of $\delta_3$ or $\delta_4$ may proceed from any value of $\pi$ by determining: (1) the next value of $\pi$, if any, at which addition is possible; (2) if not (as is usually the case), the next value of $\pi$, if any, at which substitution is possible; and (3) if neither is possible, the next value of $\pi$ at which elimination is necessary. This procedure assures that the shortest possible acceptance region is maintained for all values of $\pi$. For example, if $\pi_{U0}$ is determined by substitution, we determine next whether any $\pi$ satisfies

$$p_{n,0}(\pi) + p_{n,1}(\pi) + \sum_{r=k+1}^{n} p_{n,r}(\pi) = \epsilon. \qquad (6)$$

If so, the solution is $\pi_{U1}$ by addition. If not, we determine whether $p_{n,1}(\pi)$ decreases to $p_{n,k+1}(\pi)$ before (5), which has a minimum, increases to $\epsilon$, and in the affirmative determine $\pi_{U1} = \pi_{L,k+1}$ for $\delta_4$ by substitution where (5) with $p_{n,k+1}(\pi)$ replaced by $p_{n,1}(\pi)$ decreases to $\epsilon$. In the negative $\pi_{L,k+1}$ is determined where (5) increases to $\epsilon$. We continue in this way up to $\pi = \tfrac{1}{2}$, determining whether an addition, substitution, or elimination occurs next. The $\delta_3$ or $\delta_4$ confidence interval for $\pi$ when r 'successes' are observed is then $(\pi_{Lr}, \pi_{Ur})$.
The reason for the occurrence in the initial definition of $\delta_3$ or $\delta_4$ of confidence sets consisting of two intervals may now be discussed. The probability sums such as in Fig. 2 that include two tails always have a smooth minimum which occurs at a value of $\pi$ different from that at which its largest term (from the left tail) becomes equal to the next largest term (from the right tail). All calculations support the conjecture that the latter value of $\pi$ is always larger ($\pi < \tfrac{1}{2}$). Then $\epsilon$ may be just such a size that the two-sided sum increases from its minimum through $\epsilon$ where its largest term is still from the left tail and hence should be eliminated rather than a term on the right. This would give two intervals of $\pi$ with $A(\pi)$ containing the corresponding left-tail value of r. For each two-sided sum there would always be such an $\epsilon$, but for any particular $\epsilon$ it would occur only rarely and can evidently be avoided as specified in §1.

Table 1, comprising the $\delta_4$ system of confidence intervals for n = 1 (1) 30 and confidence coefficients 0.90, 0.95 and 0.99, was calculated in the above manner using 2-point to 7-point Lagrangian interpolation in the National Bureau of Standards Tables. In addition, Robert S. Gardner checked the entire calculation independently on a high-speed electronic calculator using automatic coding. The total time with the latter was about 50 hr. spread over two or three weeks, while desk calculation required about 200 hr. spread over many months. All differences (mainly of 1 in the third decimal place due to inadequacy of linear interpolation) were reconciled, so that Table 1 should have no error. The tables of Thompson and Clark were used for checking the limits that arise with all the probability $\epsilon$ in one tail.


3. COMPARISON OF SYSTEMS OF CONFIDENCE INTERVALS

System $\delta_2$ is shorter than $\delta_1$ under any conceivable definition of 'shortness' because all intervals of $\delta_2$ are contained in those of $\delta_1$ and some of $\delta_1$ contain points outside the corresponding intervals of $\delta_2$. On the other hand, though the $\delta_1$ and $\delta_2$ intervals more often than not contain the corresponding $\delta_3$ intervals, there are fourteen intervals out of the total of 690 $\delta_3$ intervals calculated for $n \le 20$ which contain points outside the corresponding intervals of both $\delta_1$ and $\delta_2$, and 18 (63) are longer than the corresponding intervals $\delta_1$ ($\delta_2$).




Table 1. Table of confidence limits for a proportion 


Calculated by Edwin L. Crow, Eleanor G. Crow and Robert S. Gardner according to a modification of a proposal of Theodore E. Sterne.


| Confidence coefficient (95) | Confidence coefficient (9,) 
ln r -—- n r |——— — n f 
| 90 95 99 | 90 95 99 
| 
1 0. 0:000 0-000 0-000 8 0 06000 0-000 0-000 |11 8 
1| -100 -050 -010 1| -013  -006 001 9 
2| -069 -046 020 
5 S 3| -147 1 061 10 
2 0| ound 4| -240 193 121 11 
1| -051 
2| .s18 5| .255- -289 198 
6 -418 315+ 293 12 0 
— 7 -582 500 410 1 
3 0| 0000 0-000 0-000 8| -745* 6857 —-549 3 
1| -035- -017 -003 =e d 
2| 196 135* -059 * 
3 -464 368 -215+ 0| 0000 0000 0-000 5 
1| -012 -006 -001 ji 
— 2| -061 041 -017 y 
4 o| 0-000 0-000 0-000 S] 29 Ons! 058 8 
AI «osa -votai cane 4| -210 169  -105* S 
2 143 -098  -042 
3| -320 -249 -141 5| 7232 251 — ml 10 
4| -500 473 -316 6| -390 389 -250 11 
7| -4855 442 344 12 
8| -609  .557 402 
5 0| 0000 0-000 0-000 Sy owe on cm 
1| -021 -010 002 — —-|13 0 
2 112 -076  -033 1 
3 247 «189 -log | 10 0| 0000 0-000 0-000 2 
4| -379 343 222 dE WE AE 3 
055- -037 -016 4 
5 621 “500 398 3 “116 087 +048 
B i 4| -188 150 093 5 
3 6 
6 0| 0000 0-000 0-000 5| 223 -222 — +150 7 
1] -017 009 -002 6| -341 -267 — 218 8 
2| -093 063 027 7| -352 381 297 9 
3| -201 153 --085- 8| -500 397 376 
4| 933 27 173 9| -648 603 — 488 1 
5 458 402 294 10 778 -733 -624 12 
6 655+ -598 -464 13 
i F = |11 0| 0000 0-000 0-000 
7 d m 0-000 — 0-000 1 -010 -005- -001 | 14 0 
:015- ‘001 2 -049 -033 -014 1 
2| 079 -023 3| -105- -079 -043 2 
3| 170 071 4| -169 135+ -084 3 
4 279 -142 4 
5| 8 5| -197  .200  -134 
d T sean 6| -302 -250 -194 > 
7 | a 7| -315+ -333 -262 P 


| Confidence coefficient (90) 
mm 


90 95 99 
EM 
0123 — 0-369 ue 
77 500 “40 
-«685- -631 T3 
-803 750 6t 
wi 
0-000 0.000 000 
-009 -004 13 
-045+ 030 d 
096 -072 "Om 
+154 +123 07% 
2 
84 — -18l Ld 
271 236 Er 
394 — 7294 — 305 
+398 346 E 
-500 450 
-445* 
9 550 we 
so Gp o 
-816 -764 6 
.000 
0-000 — 0000 us 
908 ` 004 — 5g 
-042 — 7028 — ogg 
-088 — 7000 — 059 
+142 113 
pul 
173 do 159 
-246 22 218 
276 260 a 
379327 $92 
-485* 413 
.406 
"E a7 
.621 -566 571 
“724 698 
+827 11 _ 
—————— 
.000 
0-000 0-000 M 
007 004 git 
039 — 028933 
-081 -061 .004 
431. — 104 d 
102 
163 193 46 
.994  -206 , 
.261  :206 


The observed proportion in a random sample of size n is r/n. The table gives the lower confidence limit for the population proportion $\pi$, as a function of n and r.
The upper confidence limit = 1 - (lower confidence limit, entered with n - r instead of r).



Table 1 (cont.) 




Confidence coefficient (95) Confidence coefficient (%) Confidence coefficient (95) 
Wo og SS OF n r 
90 95 99 90 95 99 90 95 99 
1 
4 x 17 0 19 5| 0130 0-110 0-073 
1 6 151 “147 103 
2 7| -209  -150 -137 
n 3 8| -238 - 173 
11 4 9| -265+ 212 
12 
13 5 10| -337 -312 -218 
14 6 11 | -386 3457 -293 
7 12| -386 -365- -305+ 
8 13 -440 426 -383 
15 o| 0-000 000 0 000 2 14| -560 -500 -436 
2| E I m 10| -364 337 242 15| -614 574 -485-7 
JI E Ni 11| -432 406 -338 16 | -663 -635+  -545- 
4| dos 007 059 12| -500 456 -380 17| 735- -684 — 017 
ae 2 13 | -568 (511 “413 18| 791 768  -695- 
El amu m 14| 636 -583 -500 19| -87 850 782 
5) SEM du 15| -710 — 663 — -587 
al ML de 16| -775- 46 654 |20 0| 0-000 0000 0-000 
E AN — 17| -860 834 758 i| 005 5003 001 
2| -027 -018 -008 
3| -056 -042 -023 
1 , , .328 
dE REN FES jme wi aooo Oopa oao 4| -090 -071 -044 
(poo as ATS 1| -006 -003 -001 
2 “600 552 461 2 -030 -020 -008 5 +126 104 -069 
la| AA NE. 3| 063 -047 -025+ 6| -141 -140 -098 
4| .753 608 027 4| -101  -080 049 7| 20  -143 129 
8| -221 -209 -163 
15 | -846 -809 727 5| a35- -116 -077 9| -255-7 -222 -200 
P c a 6| 168 156 <10 10| -325- -293 -209 
1 7 :316 -157 "145+ 11 -358 .293 274 
6 o| 0-000 0000 0-000 8| 257 -236 -184 JE NE 208 
> ies bes ps 9| dur 2a am 13| 422 -411 — -303 
: :02 : 14 | -500 467 -399 
3| -071 -053 +029 10| 349 — 3 -228 9 
4| 14 -090 055* 11| -416 — 53757-77314 15| -578 533 +424 
12| -464 -381 — 7318 16| -633 589 +500 
5| -147 -132 -088 13| 518 ‘444 397 17| -672 -649 576 
6| -189 -178 125+ 14| -581 -556 466 18| -745+ -707 -625+ 
7 .235+ 178 166 19 797 -778 -707 
8| .299 -272 -212 15| -651  -619 “534 i 
9| .ao5 -272 201 16| 723 -675+ 603 20| S74 857 — 791 
17 | -784 758 -682 LN 
io| 991 352 -295+ 18 -865+ 843 772 
11| -450 429 -307 b 21 0| 06000 0-000 0-000 
12 -550 -500 421 1 “005+ -002 +000 
13| ‘e19 571  -475*| 19 0| 0000 0:000 0-000 2| 026  -017 -007 
14 | .695- -648 -549 1| -006 -003 :001 3| -054 -040 022 
2| -028 -019 -008 4| -086  -008 -04l 
ts | wer. 58 .643 3| 059 044 024 g 
16 pe Een -736 4| -095+  -075* -046 5| -121 099 -065+ 




Table 1 (cont.) 




| | | k 
| | Confidence coefficient (?5) Confidence coefficient (95) Confidence coefficient (%) 
| n a }————_—_ — DO t — 4& € |—L———— — 
| 90 95 99 90 95 99 90 95 » 
(21 6 0-092 | 23 0-090 0-059 | 25 o| 0-000 0-000 Mes 
7 4120 — 084 1| -oot -002 toe 
8 )5- 427 — 4H 2| -o21 “O14 — "ng 
9 89 4178 — 040 3| -045- “034 logy 
-198 -171 4| 072 057 
10 5 
Ae 5| -101 082 2 
i2 6 -101 110 101 
13 7| 158 u8 jy 
14 8 158 161 m 
9| -214 185* 
15 404 -409 .176* 
16 | -545- — 406 10 222 — '505¢ 
17 | 602 11 238 — 5t 
18 662 12 206 — pt 
19 724 13 3173057 
14 336 
20 | -809 787 717 342 
21| 877 -863 799 15 | -389 352 
| 16 432 403 
———— a 17 -500 451 
122 0| 0-000 0-000 0-000 i Ed " 
1| -005- -002 -000 549 
2| .024 -016 -007 T o1 2e 
e eie -638 597 
3| 051 -038 -021 2 693 004 B 
4| -082 :065- -039 745 -697 * 
í 24 0| 0-000 0-000 0-000 22 SE 762 o 
| 5| -115- -094 -062 1| :004  -002 000 22| das Bio MN 
| & iiss ok ds 2| 022 015: -006 " 
| 7| 181 +132 -116 Si spam +019, 25| -899 882 
| 8| -181  -187 147 Ar “OBS 088 gu 
9| -236 205+ 179 inim c 
" aem .000 
5 -086 -057 .000 9 
000 00 .900 
10.| -289 -260 -194 6 115- 080 | 6 J “Oo :002 p 
| 11| -289 -264 342 7 122 +106 2| 021 014 Q7 
| 12| -340 -326 -273 8 ME 3| 043 0382 — 933 
| 13} -393 -383 318 9 uo ies 4| -o69 054 
14 -444 418 :334 052 
10 | -259 -234 -181 &| -097 079 07 
15 | -500 424 -396 11| -264 -246 -216 097 108 ggf 
- : 6 0 
16| 556 -500 450 12| -317 -308 -257 7| 151 14 39 
17| -607 -576 -495+ 13 | -970 “389 280 8 151 1548 — 449 
18 | -660 -611 546 14 | -413 -347 — «313 9| 209 180 
19 ‘711 -674 -604 .110 
15 | -447 -396 -362 10| -233 7212 — 495 
20| -764 -736 -666 16 | -447 -443 -364 11| -247 7230 — 934 
21 | -819 -795-7 -727 17 | -553 -500 -416 12| -299  :282 934 
22 | -885+ -868 -806 18| -577 -557 -464 13| -342 282 298 
19 | -630 -604 -536 14| -342 © :329* 
ZSR "EN 
23 0| 0-000 0-000 0-000 20. 089 | 2653), SEA in| To deo e 
TMEGDES Te sone" — -000 21| -736 -692 -636 Te aia . 45. Gu 
2| -023 -016 -007 22| -779 -754 -687 17 | -460 Ho .498 
3| -049  .037 020 23 | .835* 809 -741 18| 540 495 — 4n 
4| -078  .062 038 24| -895+ 878 819 i9 BBI 535 
LT j " ó r 
The observed proportion in a random sample of size n is r/n. The table gives the lower confidence
limit for the population proportion π, as a function of n and r.
The upper confidence limit = 1 − (lower confidence limit, entered with n − r instead of r).
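The symmetry rule stated in the note above can be sketched in Python. This is only an illustration: the function name `interval_from_lower_limits` and the dictionary layout are assumptions made here, and the four 95% lower limits for n = 22 are copied from the table as scanned, so they should be treated as illustrative values.

```python
def interval_from_lower_limits(lower, n, r):
    """Return (lower, upper) confidence limits for r successes in n trials.

    `lower` maps r -> tabulated lower limit for one n and one confidence
    coefficient (a hypothetical container, not part of the paper).
    The upper limit follows the table's rule: 1 - (lower limit at n - r).
    """
    return lower[r], 1.0 - lower[n - r]


# 95% lower limits for n = 22 at r = 0, 1, 21, 22, as read from the table.
lower_95_n22 = {0: 0.000, 1: 0.002, 21: 0.795, 22: 0.868}

lo, hi = interval_from_lower_limits(lower_95_n22, n=22, r=1)
print(f"r = 1, n = 22: ({lo:.3f}, {hi:.3f})")
```

Only the lower limits need to be stored; each upper limit is obtained by a second lookup at n − r.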


Edwin L. Crow


Table 1 (cont.) 




The observed proportion in a random sample of size n is r/n. The table gives the lower confidence
limit for the population proportion π, as a function of n and r.
The upper confidence limit = 1 − (lower confidence limit, entered with n − r instead of r).


  n    r    Confidence coefficient (%)      n    r    Confidence coefficient (%)      n    r    Confidence coefficient (%)
             90      95      99                        90      95      99                        90      95      99
26 20 | 0-62 km 45 28 5| 0-089 0.073 0-048 | 29 20 | 0-537 0-500 0-438 
21 ST € ys 6| -089 -098  -068 21| -575* -549 -47 
22| -701 -615- -607 7| -139 106 -089 22| .615- -587 
23 | .753 AE 658 8| 339  -142 -112 23| -655+ -626 
2 aS o 709 9| 197 -170  -137 24| -697 661 
4| -791 -770 702 
H 10 25| -721 -701 -646 
25| -849 s20 -766 i 26 | -775+ -749 -684 
26 | -903 s86 -830 12 27| -811 -789 -737 
13 28 | -866 834 -789 
14 29 -914 897 -840 
27 0| 0-000 0-000 0-000 wile’ oo 
1 004 -002 -000 16| -396 -325 
2| 020. I8 — :008 17| 435 384 -364 |30 0| 0-000 0:000 0-000 
3| 042 031  -OI7 18| -473 424 -364 1| 004 002 -000 
4! .000 052 032 19| 527 -463 -408 2| 018 -012 -005+ 
3| .037 -028  -015- 
5 -093 :076 -050 20 565-7 537 4 059 047 028 
6| .003 +101 070 21, -604 576 
7 145+ -110 +093 22| -645+ 616 5 083 -068 -045- 
8| -145* 148 -117 23| 0S8 -643 6| -083 -091 -063 
9| .204 -175-7 183 24| -716 693 7| -129 -100 -083 
aT; 8| -129 -131 104 
10 391 302 -166 25 | -768 741 e 9 +182 -163 127 
.185- 26| -799 788 72 
IL] .239 223 =p Sl 830 “782 
12 | .29] 209 CES 27 c 394 — -838 10 175+ 151 
13 | .396 +269 225 28 s 11 -205+ -151 
E NE eee l bi 
T -000 — 0-000 9 " 
15 “365+ 364 298 | 29 0| 0:000 99 9 3500 14 .992 249 
.332 1 -004 002 
ON UE CE ru 018 -012  -005* T 
17 | .447 430 p 2| 089 0299 -015+ 15| -336 — 324 256 
18 -500 437 u i 3 -062 -049 -030 16 376 +324 308 
19} .553 500 4 4 17 | -416 -364 329 
s| ose 070 -046 18 | -446 -403 345- 
20 -593 -563 +461 086 = 094 065+ 19 | -476 440 388 
21| 635- -585+ 539 6| ase  -108 -086 
22 -674 -636 -581 8 134 -136 -108 20 -508 .476 430 
23| .709 -684 ‘616 SIN Too ES 182 21| -545- -524 -462 
24 “761 -731 -668 / "m 22. -584 +560 495- 
.189 184 157 ox | 052 > 
«703 10) EL cell . -185* Bi | tee “597-531 
BS) roa S m TEE 11 r8 347 +206 “BEEF 088 MU 
26 -855-  -:825* t 12 1504 .951 -911 
27| .907; -890 18! "os 209 -260 25| -705+ -676 612 
14 26 735+ -708 655+ 
5| 345: -339 -263 a "781 — -756 690 
0-000 .385* 339 316 8 818 *795- 744 
Pp 0 oe Eros LOO 16 | i425- 374 -346 29| -871  .837 -794 
à 1019 013  -005* m “463 me E 
030 -016 -500 45 :397 30 :91 . E 
3 | 040. e :081 19 y 900 -849 
4| -064 






Neyman's (1937) definition of the shortest system and the short unbiased system of
confidence intervals, and Scheffé's (1942a) added definition of the shortest unbiased
system, deal with such less clear-cut comparisons in the case of continuous variables with
probability densities. However, these definitions have not been applied to discrete variables
except indirectly, by changing the given variable to a continuous one by adding a
variable with a rectangular distribution, as reviewed by Pearson (1950). In particular it
can be easily shown that none of the four systems considered in this paper is either shortest
or unbiased. While it is true that shortest systems generally do not exist even for
continuous variables, Eudey (1949) obtained shortest unbiased systems for π by the above
device. Tables of these intervals do not appear to be available.

The confidence interval systems δ₃ and δ₄ have an optimum shortness property in a
geometric sense. If an acceptance region A(π) of any system is considered to have a length equal
to r₂ − r₁ + 1, then as π varies from 0 to 1, the acceptance regions sweep out a region in the
(r, π) plane composed of not more than 2n + 1 rectangles which may be called the confidence
belt. Thus in Fig. 1 the rectangles containing A₄(π), 17 in all, would be, reading from bottom
to top:

−0.5 < r < 0.5, 0 < π < 0.012;   −0.5 < r < 1.5, 0.012 < π < 0.061;   …;
−0.5 < r < 4.5, 0.210 < π < 0.232;   0.5 < r < 5.5, 0.232 < π < 0.390;
0.5 < r < 6.5, 0.390 < π < 0.391;   1.5 < r < 6.5, 0.391 < π < 0.485;
1.5 < r < 7.5, 0.485 < π < 0.515;   …;   8.5 < r < 9.5, 0.988 < π < 1.000.


If the lower and upper confidence limits are denoted by π_Lr and π_Ur = 1 − π_L,n−r, the
area of the confidence belt is

\[ n + 1 - 2(\pi_{L1} + \pi_{L2} + \dots + \pi_{Ln}) = \sum_{r=0}^{n} (\pi_{Ur} - \pi_{Lr}). \]

The acceptance regions A₃(π) and A₄(π) are constructed so that their length is a minimum
whatever π (except for a set of measure zero): for a fixed π the probability 1 − ε is attained
with as few terms as possible by including the largest terms. Since the area of the confidence
belt is the integral of the length of A(π) over the interval (0, 1), it is as small as possible
with A₃(π) or A₄(π). Thus among all possible systems of confidence intervals, no system has
shorter total length than the systems δ₃ and δ₄.

The total length of the system δ₄ in Fig. 1 is

2(0.232 + 0.379 + 0.454 + 0.481 + 0.558) = 4.208.

Its 'mean length' d₄ (all r's equally weighted) is 0.4208, while the mean length for δ₁ is
0.4705. The mean lengths d_i are tabulated below as functions of sample size for ε = 0.10.
The differences d₁ − d₂ are positive and decrease monotonely after a maximum at n = 3.
The differences d₂ − d₃ are positive beyond n = 4 and oscillate, larger values being associated
with 'additions' in the calculation of δ₄. The differences d₁ − d₃ likewise oscillate, with an
absolute maximum of 0.087 at n = 3 and an absolute minimum in this range of 0.019.
The percentage reduction in mean length from δ₁ to δ₃ or δ₄ varies from a minimum of 5%
at n = 1 to a maximum of 12% at n = 3, 5 and 6 and is 6% at n = 30. The differences d₁ − d₃
for ε = 0.05 and ε = 0.01 are essentially the same as for ε = 0.10 beyond n = 6; the percentage
reductions are then less than for ε = 0.10 but decrease more slowly (to 4% at n = 30 for
ε = 0.01). The large-sample confidence intervals based on the normal approximation to the
binomial have, for n = 30 and ε = 0.10, a mean length that is 0.013 less than d₃ and 0.029
less than d₁.
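The total and mean length quoted for the Fig. 1 example can be checked with a few lines of Python. This is a sketch only: the five interval lengths are those in the displayed sum, and n = 9 is an inference made here from the ten equally weighted values of r (4.208 / 0.4208 = 10), not a value stated in this passage.

```python
# Lengths of the delta_4 intervals for r = 0..4 in the Fig. 1 example,
# as quoted in the text; the intervals for r = 5..9 mirror them by symmetry.
half_lengths = [0.232, 0.379, 0.454, 0.481, 0.558]
n = 9  # inferred sample size (assumption, see lead-in)

total_length = 2 * sum(half_lengths)   # sum of interval lengths over r = 0..n
mean_length = total_length / (n + 1)   # all r equally weighted

print(round(total_length, 3), round(mean_length, 4))
```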


  n         1      2      3      4      5      6      7      8
  d₁      0.950  0.834  0.740  0.667  0.612  0.565  0.528  0.497
  d₃=d₄   0.900  0.755  0.653  0.604  0.540  0.498  0.490  0.451
  d₁−d₂   0.050  0.079  0.087  0.063  0.060  0.042  0.033  0.030
  d₂−d₃   0.000  0.000  0.000  0.000  0.012  0.025  0.005  0.016

  n        10     12     14     16     18     20     25     30
  d₁      0.448  0.410  0.380  0.355  0.335  0.317  0.283  0.257
  d₃=d₄   0.417  0.373  0.350  0.328  0.306  0.296  0.267  0.242
  d₁−d₂   0.023  0.017  0.013  0.010  0.008  0.005  0.004  0.003
  d₂−d₃   0.008  0.020  0.017  0.017  0.021  0.016  0.012  0.012
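The percentage reductions in mean length quoted in the text follow directly from the tabulated mean lengths. A minimal sketch, using only a few columns of the table (the dictionary names and the helper `percent_reduction` are introduced here for illustration):

```python
# Mean lengths for epsilon = 0.10 at selected n, copied from the table above.
d1 = {1: 0.950, 3: 0.740, 5: 0.612, 6: 0.565, 30: 0.257}
d3 = {1: 0.900, 3: 0.653, 5: 0.540, 6: 0.498, 30: 0.242}


def percent_reduction(n):
    """Percentage by which d3 (= d4) is shorter than d1 at sample size n."""
    return 100.0 * (d1[n] - d3[n]) / d1[n]


for n in sorted(d1):
    print(n, round(percent_reduction(n), 1))
```

Rounded to whole percentages, these agree with the 5% at n = 1, 12% at n = 3, 5 and 6, and 6% at n = 30 stated in the text.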


The mean lengths d_i may be compared by graphical integration with the like mean length
of Eudey's shortest unbiased system for the case he gives explicitly (Graph 13), n = 5
and 1 − ε = 0.80. The mean lengths of δ₃ and Eudey's system are the same to graphical accuracy;
this common value is 7% less than d₂ and 17% less than d₁.

The advantage of δ₄ over δ₃ can be measured for given ε and n by comparing the averages
over all r from 1 to [½n] of the ratio of the length d_{i,r} of the confidence interval for a given
r divided by r/n, that is,

\[ R_i = \frac{1}{[\frac{1}{2}n]} \sum_{r=1}^{[\frac{1}{2}n]} \frac{n\, d_{i,r}}{r}, \]

where [½n] denotes the largest integer contained in ½n. A few values are indicated below
for ε = 0.10:
| | 
| n 5 | 10 | 15 | 20 25 | 30 | 
Em | | 
Ih 2-54 | 2-09 | 1-94 1:69 159 | 1:45 
R,-R, 0-29 011 | 012 0-12 0-08 0-07 
R,-R, | -005 | 004 | 0-07 | 0:03 | 006 | 0-06 
I | | 


The negative value of R₃ − R₄ results from the necessary omission of the term for r = 0 in
the averaging.

Confidence intervals are often chosen by minimizing their expected length irrespective
of the parameter value (Kendall, 1946, p. 72; Scheffé, 1942b, 1943). A disadvantage of this
criterion is its dependence on the particular form of the parameter, e.g. π or 2 arcsin √π.
Scheffé (1942b) showed that when it is applied to the logarithm of the ratio of variances
from two normal populations, it gives the shortest unbiased system. The expected lengths
of δ₁, δ₂, δ₃, δ₄ and Eudey's shortest unbiased system, say δ₅, for n = 5, ε = 0.20 were evaluated
using the National Bureau of Standards Tables and are graphed in Fig. 3 as a function of π.




SERIAL CORRELATION IN REGRESSION ANALYSIS. I 


By G. S. WATSON* AND E. J. HANNAN


Australian National University, Canberra, A.C.T. 


1. INTRODUCTION 


In a previous paper (Watson, 1955)† a theoretical analysis was made of the effect of making
a wrong presumption concerning the correlation matrix of the residuals from a regression
of the form

\[ \begin{aligned} y_1 &= \beta_1 x_{11} + \dots + \beta_k x_{1k} + u_1, \\ &\;\;\vdots \\ y_N &= \beta_1 x_{N1} + \dots + \beta_k x_{Nk} + u_N, \end{aligned} \qquad (1.1) \]

or y = Xβ + u,

when the vector u has zero expectation and

\[ E(uu') = \sigma^2 \alpha \quad (\alpha \text{ non-singular}). \]

In particular, bounds were obtained for the following quantities:

(a) The efficiency of the estimates of the β_i (see equation (1.3.5)). This is measured by the
ratio of the determinantal values of the covariance matrices of the best linear unbiased
estimates and the actual estimates used.

(b) The significance points for the test of the hypothesis

\[ f_i'\beta = \phi_i \quad (i = 1, \dots, h \le k), \]

where f_i, φ_i are given and the f_i are linearly independent. The test statistic is that
appropriate to the case α = I (see equations (1.4.33), (1.4.37) and (1.4.44) of Part I).

The bounds in each case were shown to depend upon the latent roots (particularly
the extreme latent roots) of the matrix HαH', where H is such that HγH' = I and γ
is the presumed correlation matrix. These latent roots are therefore the latent roots
of αγ⁻¹.

In this paper we go on to study cases where the errors, u_t, are generated by certain
stationary processes whose form or parameters are incorrectly prescribed. In §2
the spectral theory of stochastic processes is used to study the latent roots of αγ⁻¹. In §3
the functions of these latent roots which are needed to examine the bounds to the quantities
(a) and (b) above are evaluated when the true and assumed error processes are autoregressive
and moving-average. In §4 the results of §3 are discussed and illustrated by tables
and graphs. In general, the bounds are wide, indicating that the effect of not knowing
the true form of the error process may be severe. In §5 certain asymptotic results
are established to show that, in many cases which occur in practice, the bounds may
be attained.



* The results in §§3, 4 were obtained while one of us (G.S.W.) was a member of the Department
of Applied Economics, Cambridge.
† For convenient reference to equations in Part I of this paper, we merely prefix their numbers
by unity; thus equation (1.2.5) is equation (2.5) of Part I.




2. THE LATENT ROOTS OF αγ⁻¹


Consider the case where the u_t are generated by a stationary process with absolutely
continuous spectral function, the spectral density being

\[ f(\omega) = \sum_{t=-\infty}^{\infty} \rho_t e^{it\omega}, \qquad (2.1) \]

the ρ_t being the serial correlations of the process.
If we assume that the true spectral density is g(ω) with

\[ g(\omega)^{-1} = k \Bigl| \sum_{j \ge 0} b_j e^{ij\omega} \Bigr|^2 \quad (b_0 = 1), \qquad (2.2) \]

then we shall apply an autoregressive transformation to (1.1) obtaining a system of errors
which will be, at least asymptotically, of the form

\[ v_t = k^{1/2} \sum_{j \ge 0} b_j u_{t-j}. \qquad (2.3) \]

The spectral density of the v_t will then be

\[ \frac{f(\omega)}{g(\omega)} \, \frac{\sigma^2}{\sigma_v^2} = h(\omega), \qquad (2.4) \]

where σ_v² is the variance of the v_t.
The correlation matrix corresponding to (2.4) will be σ_v⁻²σ²HαH', so that the function
generating, as Fourier coefficients, the elements of HαH' is h(ω). For the class of processes
we shall be considering it can be shown that the latent roots of HαH' (that is, the latent
roots of αγ⁻¹) will be, asymptotically, the values of h(ω) for equidistant values of the
argument (in the range −π to π) (see Whittle, 1951; Grenander, 1952).

The extreme latent roots of αγ⁻¹ can then be approximated by the least and greatest
values of h(ω) in the range −π to π, while the sum of the latent roots will be, approximately,

7 fr 2 
H= zr h(w) dw = nz, (2:5) 
27 J -a g^ 


We shall indicate the extreme values of h(ω) in −π to π by h_m and h_M.
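The approximation described above can be illustrated numerically. The sketch below evaluates h(ω) at equidistant points for the first-order autoregressive ratio treated as case (i) in §3, with constant factors dropped as the paper does later; the function names, the grid size m and the parameter values ρ₁ = 0.5, r₁ = 0.7 are choices made here for illustration. The mean of the equidistant values stands in for H/N and is compared with the closed form (1 − 2ρ₁r₁ + r₁²)/(1 − ρ₁²).

```python
import math


def h(omega, rho, r):
    """Ratio of spectral densities for case (i), constant factors dropped."""
    return (1 + r * r - 2 * r * math.cos(omega)) / (
        1 + rho * rho - 2 * rho * math.cos(omega))


def summarize(rho, r, m=2000):
    """Least, greatest and mean value of h at m equidistant points in (-pi, pi]."""
    values = [h(-math.pi + 2 * math.pi * (j + 1) / m, rho, r) for j in range(m)]
    return min(values), max(values), sum(values) / m


h_min, h_max, h_mean = summarize(rho=0.5, r=0.7)
closed_form_mean = (1 - 2 * 0.5 * 0.7 + 0.7 ** 2) / (1 - 0.5 ** 2)
print(h_min, h_max, h_mean, closed_form_mean)
```

With an even m the grid contains ω = 0 and ω = π, where this particular h attains its extremes.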
In the original form of this paper (Watson, 1951), a different approach was used to obtain
the results of this paper. Autoregressive and moving average processes of the first and
second order were approximated by processes with commutative covariance matrices
whose latent roots and vectors were known. The present approach is more general but gives
no guide to the effect of the approximations. The earlier work showed that for the cases to
be considered below the use of the extreme values of (2.4) and the integral (2.5) will result
in errors which are sufficiently small, for our purposes, even for samples of only 15 or 20.
Similarly, in formulae such as (1.2.5), the average of the N − k largest (or smallest) roots of
αγ⁻¹ appears. We shall replace this by the average of all of the latent roots. Again the effect
will become small as N increases relative to k.

We shall consider cases where the spectral densities f(ω) and g(ω) are rational functions
of cos ω, which corresponds to the cases where we have

\[ \sum_{j=0}^{p} \lambda_j u_{t-j} = \sum_{j=0}^{q} \mu_j \epsilon_{t-j}, \qquad (2.6) \]

the ε_t being independent random variates. We shall use λ_j and μ_j for the true process as
indicated in (2.6) and l_j, m_j for the corresponding parameters in the assumed process. We
shall use p' and q' for the maximum lags in the assumed process.

The spectral density of a process, u_t, of the form (2.6) is (Doob, 1953)


(2-7) 


etas [stas 


Here σ² is the variance of the process ε_t. When the assumed process is also of the form
(2.6), the evaluation of (2.5) or the extreme values of (2.4) is relatively straightforward,
though care has to be exercised in some cases.

The bounds which are to be developed are homogeneous functions of degree zero in the
latent roots so that we can neglect the factors k and σ²σ_v⁻², which appear in h(ω), without
affecting the result, and this will be done below.

Then (2.5) becomes


x bibs Piss) 


A a 8 
H=Ne® Ey (2:8) 
A 
D UAP- 
ij=0 
3. THE EFFECTS OF A WRONG PRESCRIPTION FOR AN ERROR
PROCESS OF THE FORM (2.6)


The bounds to (a) and (b) (of §1) obtained in Part I of the present study depended on the
extreme roots and the sum of the roots of the matrix αγ⁻¹ after the removal of the roots
corresponding to any regressor vectors which were latent vectors of the covariance matrix of the
transformed residuals. In using the approximations h_m and h_M for the extremes and H for
the sum of the roots we shall, of course, be neglecting the effect of the removal of these roots.
We shall also be neglecting the effect of the value of k in the bounds to (b). The
resulting errors will, of course, vanish asymptotically, but they also appear to be small for
quite small values of N. We shall evaluate the bounds to the efficiency only for the case where
one regressor vector is not a latent vector. The more general case, where none of the k
regressor vectors are latent vectors, can be treated, approximately, by raising the bounds
for the case k = 1 to the power k.

Below, for certain processes of the general form (2.6), we give the value of h(ω), H, h_M
and h_m. In addition we show, sometimes for particular values of the parameters involved,
the lower bound to the efficiency of this estimate, E₁ (the upper bound is, of course, unity)
and the factors

\[ f_u = \frac{N h_M}{H}, \qquad f_l = \frac{N h_m}{H}. \]

These last factors, when multiplied by the significance point for t² appropriate to the case
where h(ω) is unity and the distribution of the u_t is normal, give approximations to the true
significance point (for the chosen level). This approximation was discussed in the last
paragraph of Part I of this investigation.




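(i) p = 1, q = 0, p' = 1, q' = 0.
Here it will be convenient to put −λ₁ = ρ₁ and −l₁ = r₁:

\[ h(\omega) = \frac{1 + r_1^2 - 2 r_1 \cos\omega}{1 + \rho_1^2 - 2 \rho_1 \cos\omega}, \]

\[ H = \frac{N}{1 - \rho_1^2}\,(1 - 2\rho_1 r_1 + r_1^2), \]

\[ h_M = \Bigl(\frac{1 - r_1}{1 - \rho_1}\Bigr)^2, \quad h_m = \Bigl(\frac{1 + r_1}{1 + \rho_1}\Bigr)^2 \quad (\rho_1 \ge r_1), \]

\[ h_M = \Bigl(\frac{1 + r_1}{1 + \rho_1}\Bigr)^2, \quad h_m = \Bigl(\frac{1 - r_1}{1 - \rho_1}\Bigr)^2 \quad (\rho_1 \le r_1), \]

\[ E_1 = \frac{(1 - \rho_1^2)^2 (1 - r_1^2)^2}{(1 + \rho_1^2 + r_1^2 + \rho_1^2 r_1^2 - 4\rho_1 r_1)^2}, \]

\[ f_u = \Bigl(\frac{1 - r_1}{1 - \rho_1}\Bigr)^2 \frac{1 - \rho_1^2}{1 - 2 r_1 \rho_1 + r_1^2}, \quad f_l = \Bigl(\frac{1 + r_1}{1 + \rho_1}\Bigr)^2 \frac{1 - \rho_1^2}{1 - 2 r_1 \rho_1 + r_1^2} \quad (\rho_1 \ge r_1). \]

When ρ₁ < r₁ the bounds f_u and f_l are interchanged. The values of E₁, f_u and f_l for various values
of ρ₁ and r₁ are shown respectively in Tables 1-3 below.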
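The entries of Table 1 can be reproduced from the extreme values of h(ω) in case (i). The sketch below assumes that the lower bound for a single regressor takes the form E₁ = 4 h_m h_M / (h_m + h_M)², which is algebraically the same as the closed expression for E₁ in this case; the function names are introduced here for illustration only.

```python
def extremes_case_i(rho, r):
    """Least and greatest value of h(w) = (1 + r^2 - 2 r cos w) / (1 + rho^2 - 2 rho cos w)."""
    at_zero = ((1 - r) / (1 - rho)) ** 2   # value at w = 0
    at_pi = ((1 + r) / (1 + rho)) ** 2     # value at w = pi
    return min(at_zero, at_pi), max(at_zero, at_pi)


def efficiency_lower_bound(h_min, h_max):
    """E1 = 4 h_m h_M / (h_m + h_M)^2 (assumed form, see lead-in)."""
    return 4 * h_min * h_max / (h_min + h_max) ** 2


for rho, r in [(0.0, 0.5), (0.3, 0.0), (0.5, 0.7)]:
    e1 = efficiency_lower_bound(*extremes_case_i(rho, r))
    print(rho, r, round(e1, 2))
```

Because h(ω) is monotone in cos ω here, only the two end-points ω = 0 and ω = π need to be examined.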
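(ii) p = 2, q = 0, p' = 1, q' = 0.
Again it will be convenient to put −λ₁(1 + λ₂)⁻¹ = ρ₁ and −l₁ = r₁:

\[ h(\omega) = \frac{1 + r_1^2 - 2 r_1 \cos\omega}{1 + \lambda_1^2 + \lambda_2^2 + 2\lambda_1(1 + \lambda_2)\cos\omega + 2\lambda_2 \cos 2\omega}, \]

\[ H = \frac{N (1 + r_1^2 - 2\rho_1 r_1)}{(1 - \lambda_2^2)(1 - \rho_1^2)}. \]

Case (a). r₁ = 0, |λ₁(1 + λ₂)/(4λ₂)| ≥ 1.

The extreme values of h(ω) are (1 + λ₁ + λ₂)⁻² and (1 − λ₁ + λ₂)⁻², and

\[ E_1 = (1 - \rho_1^2)^2 (1 + \rho_1^2)^{-2} \]

(see Table 1 for r₁ = 0).

Case (b). r₁ = 0, |λ₁(1 + λ₂)/(4λ₂)| < 1.

The extreme values h_M and h_m are then obtained from one of those under case (a) and

\[ \frac{4\lambda_2}{(1 - \lambda_2)^2 (4\lambda_2 - \lambda_1^2)} \]

(depending on the signs of λ₁ and λ₂).
E₁ will not be written down but is shown for some λ₁ and λ₂ in Table 4.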
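For case (b) the lower bound can be evaluated numerically rather than written down. The sketch below takes r₁ = 0, examines h at ω = 0, ω = π and at the interior stationary point cos ω = −λ₁(1 + λ₂)/(4λ₂) when that lies inside (−1, 1), and again assumes the single-regressor form E₁ = 4 h_m h_M / (h_m + h_M)². The function names are hypothetical; the three parameter pairs are rows of Table 4 that are legible in this copy.

```python
def denominator(c, lam1, lam2):
    """|1 + lam1 e^{iw} + lam2 e^{2iw}|^2 written in terms of c = cos w."""
    return (1 + lam1 ** 2 + lam2 ** 2
            + 2 * lam1 * (1 + lam2) * c
            + 2 * lam2 * (2 * c * c - 1))


def efficiency_bound_ar2_as_independent(lam1, lam2):
    """Lower bound E1 when a second-order autoregression is treated as independent."""
    candidates = [-1.0, 1.0]                       # cos w at w = pi and w = 0
    if lam2 != 0:
        c_star = -lam1 * (1 + lam2) / (4 * lam2)   # interior stationary point
        if abs(c_star) < 1:
            candidates.append(c_star)
    values = [1.0 / denominator(c, lam1, lam2) for c in candidates]
    h_min, h_max = min(values), max(values)
    return 4 * h_min * h_max / (h_min + h_max) ** 2


for lam1, lam2 in [(-1.0, 0.5), (-0.5, -0.4), (-0.1, -0.8)]:
    print(lam1, lam2, round(efficiency_bound_ar2_as_independent(lam1, lam2), 4))
```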




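Case (c). r₁ = ρ₁.

\[ h_M = \frac{1}{(1 - \lambda_2)^2}, \quad h_m = \frac{1}{(1 + \lambda_2)^2} \quad (\lambda_2 > 0), \]

\[ h_M = \frac{1}{(1 + \lambda_2)^2}, \quad h_m = \frac{1}{(1 - \lambda_2)^2} \quad (\lambda_2 < 0), \]

\[ E_1 = \Bigl(\frac{1 - \lambda_2^2}{1 + \lambda_2^2}\Bigr)^2. \]

See Table 1 for r₁ = 0 reading λ₂ for ρ₁.
The effects of these incorrect prescriptions of the error process on the test of significance
are discussed in the following section.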
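(iii) p = 0, q = 2, p' = 0, q' = 1.
We shall put

\[ \frac{\mu_1(1 + \mu_2)}{1 + \mu_1^2 + \mu_2^2} = \rho_1, \quad \frac{\mu_2}{1 + \mu_1^2 + \mu_2^2} = \rho_2, \quad \frac{m_1}{1 + m_1^2} = r_1: \]

\[ h(\omega) = \frac{1 + \mu_1^2 + \mu_2^2 + 2\mu_1(1 + \mu_2)\cos\omega + 2\mu_2 \cos 2\omega}{1 + m_1^2 + 2 m_1 \cos\omega}. \]

We consider only the case where r₁ = ρ₁. Since r₁ ≤ ½ this imposes the restriction ρ₁ ≤ ½.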


H "ORAT (1+ 22( 1-30 — )) 
lem Pı \(1 — 4p 


The extremes h,, and h, are obtained from 


1-5 p 2p. 252 
remp (+ pel - 2004-1) 
l5 ug 2p ss 
and one of —À 73 2 2 pos 
Lis 1+ i= 2) > (PıP positive), 


L+H () _ 2P 
1+m? 1+ 29, 
Only the bound E₁ was evaluated and this is shown in Table 5 and Fig. 1.
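(iv) p = 0, q = 1, p' = 1, q' = 0.
Again we put μ₁(1 + μ₁²)⁻¹ = ρ₁ and −l₁ = r₁:

\[ h(\omega) = (1 + \mu_1^2 + 2\mu_1 \cos\omega)(1 + r_1^2 - 2 r_1 \cos\omega), \]

\[ H = N\{(1 + \mu_1^2)(1 + r_1^2) - 2\mu_1 r_1\}. \]

Case (a). |(μ₁ − r₁)(1 − μ₁r₁)/(4μ₁r₁)| ≥ 1.

The extremes of h(ω) are then (1 + r₁)²(1 − μ₁)² and (1 − r₁)²(1 + μ₁)², and

\[ E_1 = (1 - r_1^2)^2 (1 - \mu_1^2)^2 (1 + r_1^2 - 4 r_1 \mu_1 + \mu_1^2 + r_1^2 \mu_1^2)^{-2}, \]

which is shown in Table 1, putting μ₁ = ρ₁.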


), (91p» negative). 


Case (b). |(μ₁ − r₁)(1 − μ₁r₁)/(4μ₁r₁)| < 1.

The extremes of h(ω) are

\[ \frac{(\mu_1 + r_1)^2 (1 + \mu_1 r_1)^2}{4 \mu_1 r_1} \]

and one of

\[ (1 + r_1)^2 (1 - \mu_1)^2 \quad (r_1 < \mu_1), \qquad (1 - r_1)^2 (1 + \mu_1)^2 \quad (r_1 > \mu_1). \]

The efficiency E₁ is shown, for this case, in Table 6.
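The Table 6 entries for r₁ = 0.4 can be recovered from the extremes of h(ω) in case (iv). As in the earlier cases, this sketch assumes the single-regressor form E₁ = 4 h_m h_M / (h_m + h_M)²; the function name and the choice of test values are made here for illustration.

```python
def efficiency_bound_ma1_as_ar1(mu, r):
    """Lower bound E1 for h(w) = (1 + mu^2 + 2 mu cos w)(1 + r^2 - 2 r cos w)."""
    def h(c):
        # h written as a function of c = cos w
        return (1 + mu * mu + 2 * mu * c) * (1 + r * r - 2 * r * c)

    candidates = [-1.0, 1.0]                       # w = pi and w = 0
    if mu != 0 and r != 0:
        c_star = (mu - r) * (1 - mu * r) / (4 * mu * r)   # interior stationary point
        if abs(c_star) < 1:
            candidates.append(c_star)
    values = [h(c) for c in candidates]
    h_min, h_max = min(values), max(values)
    return 4 * h_min * h_max / (h_min + h_max) ** 2


for mu in (0.2, 0.4, 0.6, 0.8):
    print(mu, round(efficiency_bound_ma1_as_ar1(mu, r=0.4), 2))
```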
Other cases may be considered, perhaps for restricted values of the parameters, by
identifying their parameters with those of examples (i)-(iv) above. This is because
of the inverse relationship of the spectral density functions of the autoregressive and
moving-average processes. Some examples are


p=1, q-l p'=0, g'=0 > (i), 
p=0, g-Ll p'=0, d-1-() 
p=1, q-l p'=0, g =1 > (ï), 
p=1, q=0, p =0, g =1 > (i), 
p=1, q-l p-l g =0 > (ii), 


=2 p —-0, g'=0 = (iv), 
p =2, g =0 = (iv). 


4. COMMENTS ON THE RESULTS OF §3


From the last paragraph of §3, Table 1 is relevant to several cases but is labelled for the
case where a first-order autoregressive error process, first serial correlation ρ₁, is mistaken
for a process of the same form but with first serial correlation r₁.

Table 1 is the most widely applicable of the three tables as it is of use in cases (i), (iia),
(iic), (iva), and in other situations which, by the last paragraph of §3, are equivalent to these
cases. The discussion is given in terms of case (i). The only striking feature of Table 1 is the
poor performance of the high values of r₁. Thus it appears that the first difference transformation
may lead to very low efficiencies unless ρ₁ = 1; the zero efficiency at r₁ = 1 is due to the
approximation. It will be noted that, except for high ρ₁ and r₁, a value of r₁ within 0.1 or 0.2 of ρ₁
leads to efficiencies of better than 70%. It is only when ρ₁ is high that the use of a near
exact value of r₁ is imperative for a reasonably efficient analysis.
Table 4 covers the case where a second-order autoregressive process is mistaken for a
process of independent variates. When the procedure appropriate to independent residuals
is used, the first column of Table 1 or Table 4 applies, according as |(λ₁ + λ₁λ₂)(4λ₂)⁻¹| ≥ 1 or
< 1. Table 4 also shows the corresponding values of (1 − ρ₁²)²(1 + ρ₁²)⁻², appropriate when
|(λ₁ + λ₁λ₂)(4λ₂)⁻¹| ≥ 1, for comparison. Thus while there are second-order autoregressive
processes which give direct least squares an efficiency which is dependent only on the first
serial correlation ρ₁ = −λ₁(1 + λ₂)⁻¹ and is high when ρ₁ is small, there are others which
make the efficiency very low even when ρ₁ is small.
Table 5 covers the case where a second-order moving average is mistaken for a first-order
moving average with the correct first serial correlation ρ₁. The results are displayed
graphically in Fig. 1. For each pair of values of μ₁ and μ₂ the correlograms of the true and assumed
error processes have been drawn. The graphs have been arranged in the same order as in
Table 5, and on each the lower bound to the efficiency is given. An examination of Fig. 1
shows that an apparently better agreement between the true and assumed correlograms may
lead to a worse lower bound to the efficiency. This suggests that the common practice of
comparing correlograms may not be the best procedure.




As has been seen in §2 the lower bound to the efficiency is simply related to the spectral
densities rather than to the correlograms. For the cases which we are considering the
lower bound to the efficiency can be zero only if the true spectral density has a zero. This is
seen to be so for all the cases with a zero lower bound in Table 5. This lower bound will of
course be attained (or nearly attained) for regressor vectors which are latent vectors of
the correlation matrix of the transformed residuals corresponding to the zero (or near zero)
latent root, but it may also be attained, at least asymptotically, for other regressor vectors
(see Grenander, 1952, p. 568, and §5 below).

Table 6 covers the case where a first-order moving average is mistaken for a
Markoff process. We have two cases according as |(μ₁ − r₁)(1 − μ₁r₁)(4μ₁r₁)⁻¹| is ≥ or < 1.
The first case is covered by Table 1 with μ₁ = ρ₁ and the second by Table 6. Both show that
the efficiency falls off more rapidly as ρ₁ and r₁ diverge than for cases where the two
processes are of the same type, as is to be expected.

Finally, Tables 2 and 3 show the correcting factors which may be applied to the
significance points for the test of significance of a regression coefficient (based on t²) when the
true and assumed error processes are first-order autoregressive. The meaning of these tables
may be seen from an example. Suppose that the t-test has been made when the error has a
first serial correlation of 0.5 but when the estimation procedure has been based on an
assumed first serial correlation of 0.7. Since r₁ = 0.7, ρ₁ = 0.5, the true significance point
lies between 1.22t² and 0.34t², where t² is the tabular value, assuming, of course, that the
approximation is exact.
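The two multipliers in this example can be checked from the case (i) expressions. In the sketch below each factor is taken to be an extreme value of h(ω) divided by the mean root H/N = (1 − 2ρ₁r₁ + r₁²)/(1 − ρ₁²); the function name is introduced here for illustration and is not the paper's notation.

```python
def correcting_factors(rho, r):
    """Return (smaller, larger) correcting factor for first-order autoregressive errors."""
    mean_root = (1 - 2 * rho * r + r * r) / (1 - rho * rho)   # H / N
    at_zero = ((1 - r) / (1 - rho)) ** 2 / mean_root          # from h at w = 0
    at_pi = ((1 + r) / (1 + rho)) ** 2 / mean_root            # from h at w = pi
    return min(at_zero, at_pi), max(at_zero, at_pi)


low, high = correcting_factors(rho=0.5, r=0.7)
print(round(low, 2), round(high, 2))
```

For ρ₁ = 0.5 and r₁ = 0.7 the two factors round to the 0.34 and 1.22 quoted in the text.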

In order to find the order of magnitude of the possible effects of serial correlation on
t-tests and to investigate the range of validity of the approximation, either
examples or the asymptotic approach of §5 must be considered. In the calculations shown
in Table 7, only the extreme significance levels of t-tests using the tabular significance points
were found (for the case of only one regressor in addition to the mean). This avoids the
difficulty of inverse interpolation. Although Table 7 is not extensive, it shows that the
approximation is quite a useful guide. The exact results were found by using the method of
Pitman & Robbins (1949), the calculations being performed by the EDSAC in the
Mathematical Laboratory, University of Cambridge. The second approximation there shown is
obtained by making the first two small-sample moments agree for the case where there is
only one regressor apart from the mean. For this case the adjustment corresponding to the
use of the f_u and f_l also makes the first two moments correct, asymptotically, so that fair
agreement is to be expected.

The validity of the analysis corresponding to the assumption
that the errors follow a first-order process whose first serial correlation equals that of the
true error process, when in fact the true error process is of the second order, will now be
examined by considering the disturbances in the significance levels of the t-test of regression
coefficients. Let it first be supposed that the true errors follow a second-order autoregressive
process with λ₁ = −1.3 and λ₂ = 0.3. This is the process suggested by Orcutt (1948) for
Tinbergen's series. Since 4λ₂ < λ₁², the correlogram of this process does not have harmonic
oscillations. It will be seen further that it has a weak central tendency. The first serial
correlation of the process is unity, so that the lower bound to the efficiency of a least
squares analysis is zero. Here we are supposing that a first-order autoregressive transformation
based on r₁ = 1 has been used, i.e. that the analysis has been carried out on the first
differences of the series involved. In this situation the lower bound to the efficiency may be
computed and is 0.70. In a sample of 20, the use of the second approximation for determining
the limits in the significance level of the classical t-test at 5% shows that the true
significance level may be almost as small as 1% and as great as 16%.

As a second example,
let us suppose that the errors follow a second-order autoregressive process with λ₁ = −0.5
and λ₂ = −0.4. The correlogram does not have a harmonic oscillation. If least squares had
been applied directly, the efficiency lower bound would have been 0.02. However, if the
least squares analysis had been carried out on the data after the use of a first-order autoregressive
transformation with r₁ = ρ₁ = −λ₁(1 + λ₂)⁻¹ = 0.83, the efficiency lower bound is raised to 0.54. In the
same circumstances, the true significance level of a t-test made with the tabulated 5%
point may range from 0.3% to slightly greater than 20%. These calculations confirm the
idea that better results are obtained in this case by the use of a first-order autoregressive
transformation than without it, but they also show clearly that the performance may still
be poor even when the first serial correlation is known free of error.
As a last example, consider an error process which is a first-order moving average with
μ₁ = 0.6, while it is presumed to be a Markoff process with r₁ = μ₁(1 + μ₁²)⁻¹ = 0.44, when
20 observations are available. The ratios of the least and greatest roots to the mean root are
0.330 and 1.49 respectively. Since the 5% point of t² with 18 degrees of freedom is 4.41
the first approximation tells us that a test using this point may have any significance level
between P( Rage A) and P( Pis? a „i.e. lie between 0-4 and 10-594. These prob- 
abilities agree well with 0:3 and 10:5 %, found by using the second approximation above. 
5. REGRESSION ON ANALYTIC FUNCTIONS

Grenander (1954) and Grenander & Rosenblatt (1954) consider regressor vectors x_ν
(orthonormal) which are subjected to restrictions which effectively require the existence of the
limits

\[ r_{\mu\nu}(h) = \lim_{n \to \infty} \sum_{t=1}^{n-h} x_{t+h,\mu}\, x_{t,\nu} \quad (\mu, \nu = 1, \dots, k), \]

and permit the representation

\[ R(h) = [r_{\mu\nu}(h)] = \int_{-\pi}^{\pi} e^{ih\theta}\, dM(\theta), \]

where M(θ) is a matrix of functions with M(θ₂) − M(θ₁) Hermitian non-negative definite
for θ₂ > θ₁. It is presumed that R(0) is non-singular. Introducing N(θ) = M(θ+) − M(−θ−)
(which is symmetric and real), it is shown by Grenander & Rosenblatt that if N(θ) has only
s ≤ k points of increase θ₁, …, θ_s such that
(i) dN(θ_i) = [n_μν(θ_i)] = N_i is non-null,
(ii) N_iN_j = null matrix (i ≠ j);
            = N_i (i = j),
then the least-squares procedure gives estimates of the regression coefficients which have,
asymptotically, the same covariance matrix as best linear unbiased estimates. These
conditions are also shown to be necessary for this to be so.

In fact it is easy to show (under fairly weak restriction on f(ω))* that

\[ \lim_{n \to \infty} [x_\mu' \alpha x_\nu] = \Bigl[ \int f(\theta)\, dn_{\mu\nu}(\theta) \Bigr] = \sum_i f(\theta_i) N_i. \qquad (5.1) \]

* In particular, f(ω) must not have zeros at the points θ_i (see Grenander, 1952, p. 568).
p » f(e) P 




Table 1. Lower bound to efficiency of estimates. Case p = 1, q = 0, p' = 1, q' = 0




z. M Í 
: 0 0-1 0-2 03 04 0-5 0-6 0-7 0-8 0-9 
Pı 
man) 
| 1 
0 | 100 | 0:96 | 085 | 0-70 | 0-52 | 0-36 | 0-22 | 012 | 0-05 ee 
01 | 096 | 1-00 | 096 | O84 | O68 | O49 | O31 | O19 | O07 | OON 
02 | 085 | 096 | 100 | O96 | O83 | O64 | O43 | O24 | Ol | O^. 
0-3 0-70 | 0-84 | 0:96 | 100 | 0-95 | O81 | O58 | 0-35 | 0-16 e 
04 0-52 | 0-68 | 0-83 | 0-95 | 1:00 | 0-94 | 0-76 | 0-50 | 0-24 pes 
05 | 036 | O49 | O64 | Os1 | O94 | 1-00 | oo2 | 068 | 0:36 | OO 
0-6 0-22 | 0-31 | 0-43 | 0-58 | 0-76 | 0-92 | 100 | 0-89 | 0-55 s 
0-7 012 | O17 | O34 | O35 | 0:50 | O68 | O89 | 100 | O81 | OUO 
0-8 0-05 | 007 | O11 | O16 | 0-24 | O36 | 0-55 | O81 | 1:00 | MO 
09 | 0-1 | 0-02 | 0-02 | 0-04 | 0-06 | 0-09 | 0-16 | 0-30 | 0-60 
Table 2. f_u = {(1 − ρ₁²)/(1 − 2ρ₁r₁ + r₁²)} {(1 − r₁)/(1 − ρ₁)}²
Ti 
0 01 0-2 0-3 0-4 0-5 0-6 
Pı 
0 1 0-80 | 062 | O45 | O31 = = 
0:1 1:22 | 1 0-8 | 0-58 | O4l | 0-27 E 
0-2 150 | 1:25 | 1 0-76 | 054 | 036 | 0-21 
0-3 186 | 158 | 129 | 1 0-73 | 0-49 | 0-30 
o4 | 233 | 203 | 170 | 138 | 1 0-69 | 0-42 
0-5 = 2-67 | 229 | 1-86 | 142 | 1 0-63 
0-6 — — 3-20 2-68 2-12 1:54 1 
0-7 = — — 4-14 3-40 2-58 1-74 
0-8 E — — = 6-23 5-00 3-00 
0-9 — — — = — 13-57 | 10-86 
Table 3. f_l = {(1 − ρ₁²)/(1 − 2ρ₁r₁ + r₁²)} {(1 + r₁)/(1 + ρ₁)}²
Ti 
0 0-1 0-2 0:3 0:4 0-5 0-6 
Pı Y 
"m 
0 1 1:20 1:38 1:55 1:69 — = = mx £e 
01 082 | 1 118 | 134 | 148 | 1-60 = = = te 
0-2 0-67 | 0-83 | 1 116 | 131 | 1-43 | 152 z - es 
0:3 054 | 0-69 | O84 | 1 115 | 1-28 | 138 | 145 F Ze 
0-4 043 | 0-56 | 070 | O85 | 1 113 | 125 | 133 | 1:39 | i2 
0-5 — 044 | O57 | O71 | ose | 1 112 | 122 | 1:29 | 54 
0-6 cass EN 0-45 | 0-58 | 0-72 | 0-87 1 111 119 | j46 E 
0:7 = — — 045 | O58 | 0-72 | 0-87 | 1 1-10 | 5.08 f 
0-8 -— — — — 0-42 0-56 0-71 0-87 1 1 
0-9 = = = = = 034 | 0-48 | 0-66 | 085 




Table 4. Lower bound to efficiency of estimates.
Case p = 2, q = 0, p' = 1, q' = 0; |(λ₁ + λ₁λ₂)/(4λ₂)| < 1
Parameters of the 
true error process ce 
1—pi\* True lower bound 
1+pi to the efficiency 
’ ^ | As 
“Y 
Ei: | 0-9 0-0030 0-0003 
-04 | 0-9 0-0888 0-0099 
—1:0 | 0-5 0-1476 0-0769 
—0:5 — 0-4 0-0327 0-0175 
—0:1 —0:8 0-3600 0-0122 
L 


Table 5. Lower bound to efficiency of estimates. Case p = 0, q = 2, p' = 0, q' = 1.
Bracketed figures are ρ₁ and ρ₂

  μ₂ \ μ₁         0.2                0.5                0.8
  −0.5            0.17               0.00               0.19
              (0.08, −0.39)      (0.17, −0.33)      (0.21, −0.27)
  −0.2            0.76               0.43               0.00
              (0.15, −0.19)      (0.31, −0.16)      (0.38, −0.12)
   0.2            0.79               0.38                -
              (0.22, 0.19)       (0.47, 0.16)
   0.5            0.28                -                  -
              (0.23, 0.39)
   1.0            0.00               0.00                -
              (0.20, 0.49)       (0.44, 0.44)
   2.0            0.33               0.26               0.14
              (0.12, 0.40)       (0.29, 0.38)       (0.43, 0.36)
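The bracketed serial correlations in Table 5 follow from the moving-average parameters. A minimal sketch, assuming the true process is written u_t = ε_t + μ₁ε_{t−1} + μ₂ε_{t−2} (that is, (2.6) with p = 0, q = 2 and μ₀ = 1); the function name is introduced here for illustration.

```python
def ma2_serial_correlations(mu1, mu2):
    """First two serial correlations of u_t = e_t + mu1 e_{t-1} + mu2 e_{t-2}."""
    variance = 1 + mu1 ** 2 + mu2 ** 2      # variance of u_t for unit-variance e_t
    rho1 = mu1 * (1 + mu2) / variance
    rho2 = mu2 / variance
    return rho1, rho2


for mu1, mu2 in [(0.2, -0.5), (0.5, 1.0), (0.5, 2.0)]:
    rho1, rho2 = ma2_serial_correlations(mu1, mu2)
    print(mu1, mu2, round(rho1, 2), round(rho2, 2))
```

Cells left empty in the table correspond to parameter pairs for which ρ₁ exceeds ½, the restriction noted under case (iii).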


Table 6. Lower bound to efficiency of estimates. Case p = 0, q = 1, p' = 1, q' = 0

  r₁     0.2    0.2    0.2    0.2    0.4    0.4    0.4    0.4
  μ₁     0.2    0.4    0.6    0.8    0.2    0.4    0.6    0.8
  ρ₁     0.19   0.34   0.44   0.49   0.19   0.34   0.44   0.49
  E₁     …      …      0.42   0.10   0.81   0.90   0.55   0.15


448 


Serial correlation in regression analysis. II 


Fig. 1. True and assumed correlograms from Table 5.


Table 7. Bounds to the 5% significance level in t-tests 


Exact Fus Resend i 
Case (%) approximation | approximation 
^ (96) | (96) 
— - | = 

N=11 1-4 1 2 
P1=0°3, ri =0 14-6 15 16 
N=20 = 2 4 
Pı=0:7, 7, 20:5 21 20 20 
N=20 = 0-4 3 
p1790:9, r, 20:5 =s 20 20 


[ 449 ] 


MISCELLANEA 


Revised upper percentage points of the extreme studentized deviate from the sample mean

By H. A. DAVID
University of Melbourne

Let $x_n$ and $x_1$ denote the largest and the smallest value respectively in a random sample of $n$ observations drawn from a normal population with standard deviation $\sigma$. The extreme studentized deviate from the sample mean is defined as $(x_n-\bar{x})/s_\nu$ or $(\bar{x}-x_1)/s_\nu$, where $s_\nu$ is the usual root-mean-square estimator of $\sigma$ based on $\nu$ degrees of freedom and is independent of the numerator. Tables of upper and lower 5 and 1 % points and later also of 10, 2.5, 0.5 and 0.1 % points were constructed by Nair (1948, 1952) for $n = 3(1)9$ and selected values of $\nu \ge 10$. The former set of tables is reproduced as Table 26 in Pearson & Hartley's Biometrika Tables for Statisticians (1954).

Here we give a revised and slightly extended table of the upper percentage points of $(x_n-\bar{x})/s_\nu$. This should be accurate except for possible errors of one unit in the last figure shown. Comparison with Nair's values makes it clear that the studentization procedure employed by him tends to become unsatisfactory for $\nu < 20$, as has been suspected by Pearson & Hartley (1954, p. 50, footnote). This effect is the more pronounced as $n$ increases and the significance level $\alpha$ sharpens. For any given $\alpha$ the most severe correction is generally needed in the case $n = 9$, $\nu = 10$; in place of Nair's values 2.54, [illegible], 3.28, 3.67, 3.92, 4.40 for the upper 10, 5, 2.5, 1, 0.5, 0.1 % points we find 2.50, 2.89, 3.29, 3.82, 4.24, [illegible].

The present table is based primarily on a simple approximation of a type recently discussed by the writer (David, 1956). Let $u_n = x_n - \bar{x}$. Since $u_n$ is the largest of $n$ deviations, $x - \bar{x}$, we have the approximate relation, valid for large values of $R$ (constant),

$$\Pr(u_n/s_\nu > R) \doteq n \Pr[(x-\bar{x})/s_\nu > R].$$

It follows that the upper $100\alpha$ % point of $u_n/s_\nu$ is given approximately by

$$[(n-1)/n]^{1/2}\, t_\nu(\alpha/n), \qquad (1)$$

where $t_\nu(\alpha/n)$ is the upper $100\alpha/n$ percentage point of a $t$ variate with $\nu$ degrees of freedom. As (1) slightly overestimates the true percentage points the approximate values were corrected by reference to a framework of exact percentage points constructed by numerical integration. These exact values were obtained as the solutions, $R_\alpha$, of the equation

$$\int_0^\infty \int_0^{u_n/R_\alpha} f(u_n)\, f(s_\nu)\, ds_\nu\, du_n = \alpha.$$

Here $f(u_n)$, the probability density function of $u_n$, was found by differencing of Grubbs's (1950) table of the distribution function of $u_n$, and the inner integral evaluated from tables of the incomplete $\Gamma$-function. In the least favourable case ($\alpha = 0.10$, $n = 12$, $\nu = 10$) the use of (1) led to an error of only 0.07.
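As a rough check on approximation (1), the sketch below evaluates $[(n-1)/n]^{1/2} t_\nu(\alpha/n)$ numerically. It is only an illustration: the function names are hypothetical, and the $t$ percentage point is obtained here by Simpson integration and bisection rather than from the tables used in the note.

```python
import math


def t_pdf(x, nu):
    """Density of Student's t with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - 0.5 * (nu + 1) * math.log1p(x * x / nu))


def t_upper_tail(x, nu, h=0.005):
    """Pr(t > x) for x >= 0, using Simpson's rule on [0, x]."""
    steps = max(2, 2 * math.ceil(x / (2 * h)))  # even number of intervals
    width = x / steps
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * width, nu)
    return 0.5 - total * width / 3


def t_upper_point(p, nu):
    """Upper 100p % point of t (0 < p < 0.5), found by bisection."""
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, nu) > p:  # expand until the root is bracketed
        hi *= 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if t_upper_tail(mid, nu) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def approx_upper_point(alpha, n, nu):
    """Approximation (1): sqrt((n - 1) / n) * t_nu(alpha / n)."""
    return math.sqrt((n - 1) / n) * t_upper_point(alpha / n, nu)


# Least favourable case quoted in the text: alpha = 0.10, n = 12, nu = 10.
print(approx_upper_point(0.10, 12, 10))
```

For this case the sketch gives a value of about 2.75, which exceeds the tabulated 10 % point of 2.68 for $n = 12$, $\nu = 10$ by roughly the 0.07 quoted above.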
I am much indebted to Misses Betty Laby and Gwenda Jacobs for extensive assistance with the computations.

REFERENCES

DAVID, H. A. (1956). Biometrika, 43, 85.
GRUBBS, F. E. (1950). Ann. Math. Statist. 21, 27.
NAIR, K. R. (1948). Biometrika, [illegible].
NAIR, K. R. (1952). Biometrika, 39, 189.
PEARSON, E. S. & HARTLEY, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University Press.



Table 1. Upper percentage points of the extreme studentized deviate from
the sample mean, $(x_n-\bar{x})/s_\nu$ or $(\bar{x}-x_1)/s_\nu$


3 4 5 | 6 7 8 9 10 
10 % points 
L68 | 1-92 | 233 2-56 | 2-68 
166 | 190 | | 230 | 253 | 264 
165 | 1-88 | | 217 250 | 261 
163 | L86 | 2 246 | 2-47 | 2:58 
1-62 | 1-85 -0 2-14 245 | 256 
"E | | 
161 | 184 | 200 | 212 | 244 | 254 
161 | 183 | 199 | 211 242 | 252 
160 | 182 | 1-98 | 210 | > 241 | 251 
159 | 182 | 197 | 209 | 2 2.39 | 2:49 
1-59 | 1-81 | 196 | 208 | 2 238 | 92:48 
158 | 1-80 | 1:96 | 208 | 2 237 | 247 
L57 | 178 | L94 | 205 | 2 234 | 244 
1:55 | 1-77 | 192 | 203 | 2-12 232 | 24 
154 | 1-75 | 190 | 201 | 2-10 2.99 | 238 
1:52 1-73 187 1:98 2-07 2-14 2-20 2-26 2:35 
151 | 1-71 | 185 | 1-96 | 205 | 212 | 218 | 223 | 232 
150 | 1-70 | 183 | 1-94 | 202 | 209 | 215 | 220 | 228 
5% points =i 
201 | 227 | 246 | 260 | 272 | 281 | 289 | 296 | 3:08 
198 | 234 | 242 | 256 | 267 | 2-6 | 284 | 291 | 3:03 
L96 | 221 | 239 | 252 | 263 | 272 | 280 | 287 | 2:98 
L94 | 219 | 236 | 250 | 260 | 269 | 276 | 283 | 294 
L93 | 217 | 234 | 247 | 257 | 266 | 274 | 280 | 292 
191 | 215 | 232 | 245 | 255 | 264 | 271 | 277 | 2588 
190 | 214 | 231 | 243 | 253 | 262 | 269 | 275 | 286 
189 | 213 | 239 | 242 | 252 | 260 | 207 | 273 | 284 
L88 | 241 | 238 | 240 | 250 | 258 | 205 | 271 | 282 
187 | 211 |. 227 | 239 | 249 | 257 | 264 | 270 | 280 
1-87 2-10 2-26 2-38 2-47 2-56 2-63 2-68 278 
184 | 207 | 223 | 234 | 244 | 252 | 258 | 264 | 27 
182 | 204 | 230 | 231 | 240 | 248 | 204 | 200 | 209 
180 | 202 | 217 | 228 | 237 | 244 | 250 | 206 | ?0 
1-78 | 1:99 | 214 | 225 | 233 | 241 2-47 2-52 20 
L76 | L96 | 211 | 232 | 230 | 237 | 243 | 248 | 255 
174 | 194 | 208 | 218 | 227 | 233 | 239 | 244 | ?5 
2-5 % points é ere! 
234 | 263 | 283 | 208 | 310 | 320 | s29 | 3:36 | 349 
2-30 2-58 2-77 2-92 3-03 3-13 3.22 3:29 $56 
227 | 254 | 273 | 287 | 298 | 308 | 316 | 323 | 9359 
2.94 2-51 2-69 2-83 2-94 3-03 3-11 3:18 FF 
222 | 248 | 266 | 279 | 290 | 299 | 307 | 314  ? 
220 | 245 | 263 | 276 | 287 | 296 | 304 | 311 Ay 
218 | 243 | 261 | 274 | 284 | 293 | 301 | 908 | 3.15 
2-17 2-42 2-59 2-72 2-82 2-91 2-98 3:05 332 
215 | 240 | 257 | 270 | 280 | 289 | 296 | 302 | %10 
214 | 239 | 256 | 208 | 258 | 287 | 294 | 3:00 
213 | 237 | 254 | 267 | 277 | 285 | 292 2-98 a 
7 0 | 234 | 250 | 262 | 272 | 2:80 | 287 | 293 | 3.96 
“07 2-30 2-46 2-58 2-67 2-75 2-81 2-87 2.91 
2-04 2-27 2-42 2-53 2-62 2-70 2-76 2-82 
85 
2-01 2.23 2.38 2-49 2-58 2-65 2-71 2-76 2:8 
L98 | 220 | 234 | 245 | 253 | 260 | 266 | 271 2 
1-95 2-16 2-30 2-41 2-49 2-56 2-61 2-66 


Table 1 (cont.) 




Exact linear sequential tests for the mean of a normal distribution


By J. TAYLOR
Food Research Department, Unilever Limited


1. INTRODUCTION 


Wald (1947, chapter 7) described sequential probability ratio tests for the mean of a normal distribution of known variance. He gave formulae for their operating characteristics and mean sample sizes and, recognizing that these were approximate, also set limits to these quantities. For tests in which the mean sample size is small, i.e. in which the alternative hypotheses differ greatly, the errors of the first and second kind are much less than the nominal values. The mean sample size is greater than the nominal value and, although it is well beneath that of a fixed-sample-size test with the same actual errors, may be close to that of such a test with the same nominal errors. Baker (1950) used random normal deviates to investigate the test for one pair of alternative hypotheses. For these, the error of the first kind was approximately the mean of Wald's limits and the mean sample size was approximately Wald's upper limit.

Sequential tests whose actual errors are close to pre-assigned values are desirable, and no practicable direct method of obtaining them has been proposed. It is suggested that they be obtained by calculating the errors of given tests and using interpolation. One way would be to use Baker's approximations, but this would depend on results for only one pair of alternative hypotheses. Another method would be to use an improved approximation to the operating characteristic, due to Page (1954a, b), and this is outlined in § 3 of this paper. The present author has used a method based on random sampling for tests for which the errors of the first and second kind are both 0.05, and the results of his investigation are presented in the remaining sections.


2. NOTATION

We shall assume that $x$ is normally distributed with mean $\theta$ and standard deviation 1, and that the alternative hypotheses are

$$H_0\colon\ \theta = \theta_0,$$

$$H_1\colon\ \theta = \theta_1.$$

The errors of the first and second kind are to be $\alpha$ and $\beta$.


3. WaLD's TEST AND PAGE'S IMPROVED APPROXIMATION 


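Wald's test consists of taking observations $x_i$ in sequence and plotting

$$y_m = \sum_{i=1}^{m} x_i$$

against $m$ (see Fig. 1). The boundaries $L_0$ and $L_1$ are the lines

$$y = \frac{1}{\theta_1-\theta_0}\ln\frac{\beta}{1-\alpha} + \frac{\theta_1+\theta_0}{2}\,m$$

and

$$y = \frac{1}{\theta_1-\theta_0}\ln\frac{1-\beta}{\alpha} + \frac{\theta_1+\theta_0}{2}\,m \qquad (1)$$

respectively (ln denotes a natural logarithm).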
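The boundary lines (1) can be evaluated directly. The following is a minimal sketch, assuming unit standard deviation as in § 2; the function name `wald_boundaries` is hypothetical and is used only for illustration.

```python
import math


def wald_boundaries(theta0, theta1, alpha, beta):
    """Return (lower intercept, upper intercept, common slope) of the
    lines L0 and L1 in equations (1), for observations with unit
    standard deviation."""
    d = theta1 - theta0
    lower = math.log(beta / (1 - alpha)) / d   # intercept of L0
    upper = math.log((1 - beta) / alpha) / d   # intercept of L1
    slope = 0.5 * (theta1 + theta0)            # both lines share this slope
    return lower, upper, slope


# Nominal alpha = beta = 0.05 with theta0 = 0, theta1 = 1:
# the intercepts are -ln 19 and +ln 19, and the slope is 1/2.
print(wald_boundaries(0.0, 1.0, 0.05, 0.05))
```

With equal nominal errors the two intercepts are symmetric about zero, so the scheme is described by a single intercept and the slope of the mid-line.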
The approximations in Wald's formulae are due to the fact that many schemes end some distance beyond $L_0$ and $L_1$ instead of on those lines (i.e. they end on one of the full lines down from the $A_i$ or one of the broken lines up from the $B_i$, and not at points $A_i$ or $B_i$). One consequence is that if equations (1) are used to calculate boundaries with, say, $\alpha = \beta = 0.05$, the scheme will have errors which are somewhat less than 0.05.
Page (1954a) has shown how the true operating characteristic can be caleulated by ration 6° T 

integral equation, and (19545) has discussed methods of obtaining the solution. His no | 

s to ours if we take 
sponds to 0 = 4(0,-0,), 


453 


2 for a scheme 


a Miscellanea 
1d uso t] "T 
with b he probabilities (as given in his Table 2) for Z = $- These are the values ofa= 
oundaries 
Ot A 


and 
Ja rm. t 


of Z and may then be used to find the value of 0 for which 
"s table shows that, for Z = 5, we shall obtain 
values of Z would give a curve 


ld be investigated by taking Z + 1^. 


Tho s 

=, E Es calculated for a given value 

[LM eh ecce value. Thus interpolation in Page's tà 

Connecti 5 for 0 = 0-26, approximately- Similar calculations for other 
ing 0 and Z fora = f = 0-05. Tes 


a 


ts with & +f cou 


Fig. 1 Boundaries for a linear sequential scheme. 


4, AN ALTERNATIVE APPROACH 


An, values of & = for various values of Z. There is no 
at of determining the probability of 


oth, 

losg Ss approach is to fix 91 — 0o A the proble 

frossing V eiie if we tako mi pim a; are Ta normal deviates with mean 0 and standard devia- 

lon 1 e upper boundary W ne) hey that the lower (u pper) crossed at m = i. We can 

te Py and P,, from tables of the normal 

a 9. tl distribution of Vi aba n sad 

— ON is im : = e to the 
pranticabl. pun n used to onstruct 2000 sequen. 

M i atively large. 


5000 devia! 
ssed the upper boundary, and if 


first 20,000 deviates 


ces of ten values (y1 --- Y10)- These have 
d not cross & boun- 


n u; 

da Sed for several sets OF \ r ni 

Ty f. ted the remaining 
Or m «x 10 were comple ys U, of these finally cro 


Se 
u 
quences continued beyond” 


t] 
hen the estimated errors of the scheme are " 
= R= PutPutPey, 


and 5 

the standard error of ĝis Ue DL | 
zio NJ 

standard error Se» the estimated mean 


Algo if ‘ e N5 
Sa; lso if jj is the mean gam ple £s doen i > . H, is true) i5 
aw hole 


"'aple si 
size for the scheme 49 " 
= (P zd ausa onor 
Ith st mar o E l 
andard error Pese apquences rossing the boundaries for m = E 
i ulated values. 


m rti 
ed ay be noted that the propo” € UM the 
» Within limits of sampling on™ 


Wre 


454. Miscellanea 


5. RESULTS OBTAINED BY THE SAMPLING METHOD

Table 1 gives the principal results. In order that efficiencies can be compared, the sizes $n_f$ of fixed-sample-size tests with the same $\alpha$, $\beta$ and $\theta_1$ are given as calculated, without rounding-off to the next highest integer.
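The unrounded fixed-sample sizes can be reproduced from the usual normal-theory formula $n_f = \{(z_\alpha + z_\beta)/(\theta_1-\theta_0)\}^2$ for unit standard deviation. The paper does not state this formula explicitly, so the sketch below is an assumption, although it agrees with the tabulated values; the function name is hypothetical.

```python
from statistics import NormalDist


def fixed_sample_size(alpha, beta, theta0, theta1):
    """Unrounded size of the one-sided fixed-sample test of theta0
    against theta1 (unit standard deviation) with errors alpha, beta."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(1 - beta)
    return ((z_alpha + z_beta) / (theta1 - theta0)) ** 2


# Scheme with theta1 = 1.00 and estimated errors 0.0481 (Tables 1 and 3).
print(fixed_sample_size(0.0481, 0.0481, 0.0, 1.0))
```

For $\alpha = \beta = 0.0481$ and $\theta_1 = 1.00$ this gives about 11.07, the value of $n_f$ shown for that scheme.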


For each value of $\theta_1$, the probits of the values of $\hat\alpha$ are approximately linearly related to $Z$. Interpolation gives the values $Z$ (in Table 2) for schemes having $\alpha = \beta = 0.05$. The standard errors of $Z$ cannot be calculated, because the errors of the $\hat\alpha$ for different schemes are not independent (the same random sequences were used for all schemes). However, we may be confident that the standard error of the significance level for each $Z$ is under 0.005 and so the true significance level for each of the schemes given will be between 0.04 and 0.06. This degree of approximation is much less than in Wald's


Table 1. Characteristics of some linear sequential schemes

         |      | Significance level    | Mean sample size    |         | Efficiency
$\theta_1$ | $Z$  | $\hat\alpha$ | S.E.   | $\bar n_0$ | S.E.   | $n_f$   | ($n_f/\bar n_0$)
1.00     | 1.75 | 0.0988 | 0.0054       | 4.14 | 0.05         | 6.64    | [illegible]
         | 2.00 | 0.0798 | 0.0052       | 4.71 | 0.06         | 7.91    | 1.68
         | 2.25 | 0.0647 | 0.0049       | 5.33 | 0.07         | 9.20    | 1.73
         | 2.50 | 0.0481 | 0.0044       | 6.03 | 0.08         | 11.07   | 1.84
1.25     | 1.50 | 0.0776 | 0.0043       | 3.15 | 0.03         | 5.17    | [illegible]
         | 1.75 | 0.0593 | 0.0041       | 3.63 | 0.04         | 6.24    | [illegible]
         | 2.00 | 0.0439 | 0.0038       | 4.15 | 0.05         | 7.44    | [illegible]
1.50     | 1.00 | 0.0863 | 0.0031       | 2.06 | 0.02         | 3.31    | [illegible]
         | 1.25 | 0.0643 | 0.0032       | 2.44 | 0.02         | 4.11    | [illegible]
         | 1.50 | 0.0477 | 0.0032       | 2.83 | 0.03         | 4.95    | 1.75


Table 2. Characteristics of interpolated schemes 


R : sE 
Z-Z, No.1 n, Efücienoy ona 
(fno, 1) E 
= “I 0:32 
—047 | 5:93 1082 | 1.82 9:18 0-19 
—046 | 399 6-92 L7; 6-45 012 
—0-50 | 2-76 4-81 174 | 443 
Table 2 also gives values Zes ES 
calculated fr. Wi : we E p 
pos rom. ald’s equations (1) to correspond, for the given 0, and a nominal & = p= i "d 
men (2). The differences Z — Z,show no regular trend and have a mean of — 0°48, 5° 
‘at for other values of 0, in this range we should take 
2 Inlg 
f= — dg P 
The val hi E ! m 
ues 71, have been inte o Posi 
good. The mean sample size S pura pue Table 1. They show that tho efficiency of th 


" 3 : me 
? 11$ true is also important, and will be d glue 
1 Considered, 500 random Sequences were used to obtain t? 


$\bar n_{\max}$, and their standard errors. It will be seen that they are less than $n_f$. Thus these schemes may confidently be recommended, when they can be applied, as always having lower mean sample sizes than fixed-sample-size tests.

6. TRUNCATION

It is a disadvantage of sequential schemes that they occasionally require very large sample sizes, and it is sometimes recommended that sampling should stop at two or three times the expected mean sample size. For the test given in this paper it would then be reasonable to accept $H_0$ or $H_1$ according to whether $y_m$ was less than or greater than $\tfrac12 m\theta_1$ (i.e. below or above the mid-point of $A_m B_m$). This would give a test with $\alpha = \beta$.

The mean sample size ($\bar n_t$, say) would be less than $\bar n_0$, but the increase in $\alpha$ and $\beta$ reduces the size of the equivalent fixed-sample-size test. Table 3 shows the effect of truncating a particular linear scheme at various values of $m$.

Table 3. Effect of truncation for $\theta_1 = 1.00$, $Z = 2.50$

$m$      | $\alpha$ (= $\beta$) | $\bar n_t$ | $n_f$ | Efficiency ($n_f/\bar n_t$)
6        | 0.1244 | 4.50 | 5.32  | 1.18
9        | 0.0793 | 5.32 | 7.95  | 1.49
12       | 0.0654 | 5.70 | 9.13  | 1.60
15       | 0.0530 | 5.88 | 10.45 | 1.78
$\infty$ | 0.0481 | 6.03 | 11.07 | 1.84

Thus truncation below $m = 15$ causes an appreciable increase in $\alpha$ and loss in efficiency. Similar calculations have been made for the other values of $\theta_1$ and $Z$ shown in Table 1, and it has been found generally true that truncation at $2.5\,\bar n_0$ is satisfactory, but not at lower values of $m$.
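The sampling experiment behind Table 3 can be imitated with pseudo-random normal deviates. The sketch below is not the author's procedure (which used tabulated random deviates and exact probabilities for the first steps); it assumes $\theta_0 = 0$, boundaries at a distance $Z$ above and below the mid-line $y = \tfrac12\theta_1 m$, and the symmetrical truncation rule described above. All names are hypothetical.

```python
import random


def simulate_linear_scheme(theta1, z, true_mean, n_seq, max_m=None, seed=1):
    """Estimate Pr(accept H1) and the mean sample size of a linear
    sequential scheme by simulation.

    Assumed boundaries: y = theta1 * m / 2 + z (accept H1) and
    y = theta1 * m / 2 - z (accept H0).  If max_m is given, sampling
    stops there and the decision follows the side of the mid-line."""
    rng = random.Random(seed)
    accept_h1 = 0
    total_m = 0
    for _ in range(n_seq):
        y, m = 0.0, 0
        while True:
            m += 1
            y += rng.gauss(true_mean, 1.0)
            mid = 0.5 * theta1 * m
            if y >= mid + z:          # upper boundary crossed
                accept_h1 += 1
                break
            if y <= mid - z:          # lower boundary crossed
                break
            if max_m is not None and m >= max_m:
                if y > mid:           # symmetrical truncation rule
                    accept_h1 += 1
                break
        total_m += m
    return accept_h1 / n_seq, total_m / n_seq


# Under H0 (true mean 0) the acceptance rate of H1 estimates alpha.
print(simulate_linear_scheme(1.0, 2.5, 0.0, 20000))            # untruncated
print(simulate_linear_scheme(1.0, 2.5, 0.0, 20000, max_m=6))   # truncated at m = 6
```

With enough sequences the estimates should fall near the corresponding rows of Table 3, within sampling error.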
An alternative rule for action at the point of truncation would be to accept $H_0$ if the upper boundary had not been crossed, i.e. a path reaching any point on $A_m B_m$ would result in the acceptance of $H_0$. This gives a guarantee that $\alpha$ is less than $\hat\alpha$, but $\beta$ becomes greater than $\hat\alpha$ and the efficiency is appreciably less than for the symmetrical truncation considered above.


7. SUMMARY

Wald's test for the mean of a normal distribution with known standard deviation involves an approximation which is poor when the alternative hypotheses differ greatly. The characteristics of some linear sequential tests have been calculated, and it is found that they have the high efficiency customary in sequential probability ratio tests. Some schemes for which the errors of the first and second kinds are 0.05 are given. Truncation of such schemes has also been investigated and it has been concluded that this should not be done at less than 2.5 times the mean sample size on the null hypothesis.

Mention is made of a method, due to Page, of calculating the operating characteristic of Wald schemes. This provides another method of producing schemes with prescribed errors.

The author thanks the directors of Unilever Limited for permission to publish this paper. He also thanks Miss R. M. Stimson, who did most of the computing involved.

REFERENCES

BAKER, A. G. (1950). Properties of some tests in sequential analysis. Biometrika, 37, 334-46.
KENDALL, M. G. (1941). Proof of relations connected with the tetrachoric series and its generalization. Biometrika, 32, 196.
PAGE, E. S. (1954a). An improvement to Wald's approximation for some properties of sequential tests. J. R. Statist. Soc. B, 16, 136-9.
PAGE, E. S. (1954b). The Monte Carlo solution of some integral equations. Proc. Camb. Phil. Soc. 50, 414-25.
WALD, A. (1947). Sequential Analysis. New York: John Wiley and Sons, Inc.
WOLD, H. (1948). Tracts for Computers. No. XXV. Random Normal Deviates. Cambridge University Press.


Fisher & Yates (1953, Table XXI) present $S_n$ to 4 decimal places. These values were calculated from the values of $a_{n,r}$, given to 2 decimal places in Table XX, by squaring and adding. An upper bound for the absolute error of the tabulated $S_n$ is therefore*

$$2 \times 0.005 \times \sum_{r=1}^{n} |a_{n,r}| \approx 0.008\,n \quad \text{(about)},$$

since

$$\lim_{n\to\infty} \frac{1}{n}\sum_{r=1}^{n} |a_{n,r}| = E|x|,$$

$x$ being again a standardized normal variate. Thus $S_{50}$, given as 47.3830 in the tables, may be in error by as much as 0.4, i.e. by somewhat less than 1 %.
The values of the cs tabulated above allow the computing of S, for values of n from 1 to 1 i BE, 
with great precision by means of equation (14), the last term in the right-hand member of Lae RE. 
2; for n = 12 and n = 13. Upper bounds for the errors, induced by the errors of the tabula! on, Thus 
S, and S; are 6-1 x 10-5 and 6-6 x 10-5 respectively, and are very much less for lower values of n. 
the upper bounds for the errors in Sj) and Sı are 5-0 x 10-7 and 5:5 x 10-7 respectively. bo used. 
For values of n> 13 higher order c's are required if all the terms in the series of (14) are to as far 
However, this is not strictly necessary, since high accuracy may be obtained by including terms S 50. 
as that involving o;; only. Fisher & Yates's tables may thus be extended to cover values of rl de 
The error in S, now consists of a part due to the non-inclusion of terms involving a5, 015) ++» 2S W An 
(just as for n< 13) of a relatively small part due to the errors of the tabulated values of Qis z e or 
upper bound to the total error for S59 is 0-05, while for Shoo it is 0-10. More generally for moder 
high values of n it is about 0-1 % of the value of S. 


REFERENCES

FISHER, R. A. & YATES, F. (1953). Statistical Tables for Biological, Agricultural and Medical Research, 4th ed. Edinburgh: Oliver and Boyd.
RUBEN, H. (1954). On the moments of order statistics in samples from normal populations. Biometrika, 41, 200-27.



On the moments of the range and product moments of extreme 
order statistics in normal samples 


Bv H. RUBEN 


Statistical Laboratory, Manchester U. niversity 


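$$n(n-1)\,[F(v)-F(u)]^{n-2} f(u) f(v) \qquad (v>u), \qquad (1)$$

where

$$f(u) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 u^2}, \qquad (2)$$

$$F(u) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{u} e^{-\frac12 x^2}\,dx. \qquad (3)$$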


* Actually, it is very slightly less for odd $n$, since $a_{n,r}$ is accurately zero when $r = \tfrac12(n+1)$.


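Hence, for $n$ even,

$$
\begin{aligned}
E(uv) &= n(n-1)\iint_{v>u} [F(v)-F(u)]^{n-2} f(u) f(v)\,uv\,du\,dv \\
&= \tfrac12 n(n-1)\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} [F(v)-F(u)]^{n-2} f(u) f(v)\,uv\,du\,dv \\
&= \tfrac12 n(n-1)\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \sum_{k=0}^{n-2} (-1)^k \binom{n-2}{k} [F(u)]^{k} [F(v)]^{n-2-k} f(u) f(v)\,uv\,du\,dv \\
&= \tfrac12 n(n-1)\sum_{k=0}^{n-2} (-1)^k \binom{n-2}{k} \int_{-\infty}^{\infty} u\,[F(u)]^{k} f(u)\,du \int_{-\infty}^{\infty} v\,[F(v)]^{n-2-k} f(v)\,dv \\
&= \frac{n(n-1)}{8\pi}\sum_{k=0}^{n-2} (-1)^k \binom{n-2}{k}\, k(n-k-2)\, u_{k-1}(\cos^{-1}(-\tfrac13))\, u_{n-k-3}(\cos^{-1}(-\tfrac13)) \qquad (n = 2, 4, 6, \ldots), \qquad (4)
\end{aligned}
$$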
where $u_m(\theta)$ is the ratio of the content of a regular (hyper)spherical simplex, with primary angles $\theta$, to the content of the surface of a sphere immersed in $m$-dimensional space, the simplex being constructed on the surface of such a sphere. In the derivation of equation (4), use has been made of the result (Ruben, 1954) that the expectation of the upper extreme statistic in a normal sample of size $t$ is

$$t\int_{-\infty}^{\infty} u\,[F(u)]^{t-1} f(u)\,du = \frac{t(t-1)}{2\sqrt{\pi}}\,u_{t-2}(\cos^{-1}(-\tfrac13)) \qquad (t = 1, 2, \ldots). \qquad (5)$$

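The relationship between the present $u$'s and the $v$'s of the previous paper is $u_m(\cos^{-1}(-1/x)) = v_m(x)$.
Equation (4) may be written in slightly simpler form, as follows:

$$
\begin{aligned}
E(uv) &= \frac{n(n-1)}{8\pi}\sum_{k=1}^{n-3} (-1)^k \binom{n-2}{k}\, k(n-2-k)\, u_{k-1}(\cos^{-1}(-\tfrac13))\, u_{n-k-3}(\cos^{-1}(-\tfrac13)) \\
&= \frac{n(n-1)(n-2)(n-3)}{8\pi}\sum_{k=1}^{n-3} (-1)^k \binom{n-4}{k-1}\, u_{k-1}(\cos^{-1}(-\tfrac13))\, u_{n-k-3}(\cos^{-1}(-\tfrac13)) \\
&= -\frac{n(n-1)(n-2)(n-3)}{8\pi}\sum_{k=0}^{n-4} (-1)^k \binom{n-4}{k}\, u_{k}(\cos^{-1}(-\tfrac13))\, u_{n-4-k}(\cos^{-1}(-\tfrac13)) \qquad (n = 2, 4, \ldots). \qquad (6)
\end{aligned}
$$

The terms in the series of equation (6) corresponding to $k$ and to $n-4-k$ are equal so that yet another version of equation (4) more adapted to computation is

$$
\begin{aligned}
E(uv) = {} & -\frac{n(n-1)(n-2)(n-3)}{4\pi}\sum_{k=0}^{\frac12 n-3} (-1)^k \binom{n-4}{k}\, u_{k}(\cos^{-1}(-\tfrac13))\, u_{n-4-k}(\cos^{-1}(-\tfrac13)) \\
& -\frac{n(n-1)(n-2)(n-3)}{8\pi}\,(-1)^{\frac12 n-2}\binom{n-4}{\frac12 n-2}\, u^2_{\frac12 n-2}(\cos^{-1}(-\tfrac13)) \qquad (n = 2, 4, \ldots). \qquad (7)
\end{aligned}
$$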

A similar argument may be used to derive $E(v-u)^m$, the moments of the range $w_n$ based on a sample of size $n$ from normal populations, when $m$ and $n$ are both odd and when $m$ and $n$ are both even, since the integrands defining these moments will then be symmetrical in $u$ and $v$ as in the derivation of equation (4). The moments themselves may then be expressed as sums of products of contents of simplices whose angles are either $\theta$ or $\pi-\theta$ where


1 
0 = cos-(—1), cos“ —}), «+. cos (i4 5) j 


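We shall not derive the relevant expressions here but shall instead use (4) to obtain the second moment only of $w_n$ for even $n$. We have

$$
\begin{aligned}
E(w_n^2) &= E(v-u)^2 \\
&= 2E(v^2) - 2E(uv) \\
&= 2 + \frac{n(n-1)(n-2)}{2\pi\sqrt{3}}\,u_{n-3}(\cos^{-1}(-\tfrac14)) \\
&\quad + \frac{n(n-1)(n-2)(n-3)}{4\pi}\sum_{k=0}^{n-4} (-1)^k \binom{n-4}{k}\, u_{k}(\cos^{-1}(-\tfrac13))\, u_{n-4-k}(\cos^{-1}(-\tfrac13)) \qquad (n = 2, 4, \ldots), \qquad (8)
\end{aligned}
$$

on using equation (10.1) of an earlier paper (Ruben, 1954) to express $E(v^2)$ in the form used in equation (8).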


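It should be noted incidentally that for all $n$

$$E(w_n) = E(v-u) = 2E(v) = \frac{n(n-1)}{\sqrt{\pi}}\,u_{n-2}(\cos^{-1}(-\tfrac13)) \qquad (n = 1, 2, \ldots), \qquad (9)$$

on using equation (10.1) referred to earlier.

Equations (8) and (9) may be used to devise explicit expressions for the variances of $w_2$ and $w_4$ in terms of elementary trigonometric functions. Similar expressions for the variance of $w_n$ for larger even $n$ are not possible, since the formulae for these variances involve the contents of spherical tetrahedra, pentahedra, etc., and these cannot be expressed in terms of the elementary trigonometric functions (cf. Ruben, 1954, 1956). On the other hand, $E(w_6^2)$, as opposed to var $w_6$, can be expressed in the desired form. In fact,

$$E(w_2) = \frac{2}{\sqrt{\pi}}\,u_0(\cos^{-1}(-\tfrac13)) = \frac{2}{\sqrt{\pi}}, \qquad (10)$$

$$E(w_4) = \frac{12}{\sqrt{\pi}}\,u_2(\cos^{-1}(-\tfrac13)) = \frac{12}{\sqrt{\pi}}\cdot\frac{\cos^{-1}(-\tfrac13)}{2\pi}, \qquad (11)$$

$$E(w_6) = \frac{30}{\sqrt{\pi}}\,u_4(\cos^{-1}(-\tfrac13)), \qquad (12)$$

$$E(w_2^2) = 2, \qquad (13)$$

$$E(w_4^2) = 2 + \frac{24}{2\pi\sqrt{3}}\,u_1(\cos^{-1}(-\tfrac14)) + \frac{24}{4\pi}\,u_0^2(\cos^{-1}(-\tfrac13)) = 2 + \frac{6}{\pi\sqrt{3}} + \frac{6}{\pi}, \qquad (14)$$

$$
\begin{aligned}
E(w_6^2) &= 2 + \frac{120}{2\pi\sqrt{3}}\,u_3(\cos^{-1}(-\tfrac14)) + \frac{360}{4\pi}\left\{2u_0(\cos^{-1}(-\tfrac13))\,u_2(\cos^{-1}(-\tfrac13)) - 2u_1^2(\cos^{-1}(-\tfrac13))\right\} \\
&= 2 + \frac{120}{2\pi\sqrt{3}}\cdot\frac{3\cos^{-1}(-\tfrac14)-\pi}{4\pi} + \frac{360}{4\pi}\left\{\frac{\cos^{-1}(-\tfrac13)}{\pi} - \frac12\right\} \\
&= 2 - \frac{15}{\pi\sqrt{3}} - \frac{45}{\pi} + \frac{15}{\pi^2}\left\{\sqrt{3}\cos^{-1}(-\tfrac14) + 6\cos^{-1}(-\tfrac13)\right\}, \qquad (15)
\end{aligned}
$$

$$\operatorname{var} w_2 = 2 - \frac{4}{\pi}, \qquad (16)$$

$$\operatorname{var} w_4 = 2 + \frac{6}{\pi} + \frac{2\sqrt{3}}{\pi} - \frac{36}{\pi^3}\left(\cos^{-1}(-\tfrac13)\right)^2. \qquad (17)$$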

U,(x) and V, (x) (the latter 
(Ruben, 1954)), at least as far as n = 100, x 


It is also hoped shortly 
statistics for normal sampl 


REFERENCES

RUBEN, H. (1954). On the moments of order statistics in samples from normal populations. Biometrika, 41, 200.
RUBEN, H. (1956). On the sum of squares of normal scores. Biometrika, 43, 456.




On estimating binomial response relations 


By F. J. ANSCOMBE 
Statistical Laboratory, University of Cambridge* 


Berkson, in numerous papers, has rightly stressed the preferability of assuming a logistic form of dose- 
response law to assuming an integrated-normal law: (i) the logistic law has, in some applications at least, 
Some sort of theoretical justification, while the integrated-normal law has none, apart from a wholly 
Speculative argument about a distribution of individual tolerances; (ii) the shapes of the two types of 
response curve are so similar that only a very extensive series of observations could hope to show any 
appreciable difference in goodness of fit; (iii) simple sufficient statistics are available for the parameters 
of the logistic law. 

The objects of this note are (a) to point out that fitting a logistic law by maximum likelihood is not as laborious as Finney, in several publications, has made it out to be, and (b) to examine Berkson's method of fitting by 'minimum logit $\chi^2$', suggesting a small modification of it. Remarks are also made about testing the adequacy of the assumed logistic form of law, and about fitting the 'angular' response law.


1. NOTATION 


Groups of subjects are tested at each of $k$ different values $x_i$ of a predetermined variable ($i = 1, 2, \ldots, k$). In toxicology, $x_i$ would be the $i$th dose expressed logarithmically; and it will be convenient here to refer to the $x$'s as 'doses'. Of the $n_i$ subjects tested at dose $x_i$, $r_i$ are observed to respond. It is assumed that the $r_i$ have independent binomial distributions with parameters $n_i$ and $P_i$ ($= 1 - Q_i$), where $P_i$ is a given function of unknown parameters $\alpha$ and $\beta$ requiring estimation. The logistic law is that

$$P_i = [1 + \exp(-\alpha - \beta x_i)]^{-1}, \quad\text{or}\quad \ln(P_i/Q_i) = \alpha + \beta x_i. \qquad (1)$$

It is also convenient to write $\alpha + \beta x_i = \beta(x_i - \mu)$, so that $\mu$ ($= -\alpha/\beta$) is the dose for which $P = \tfrac12$.

Often there are two or more such sets of observations, and then it may be appropriate to assume that the parameter $\beta$ is the same for each set. For simplicity of discussion, however, it will be supposed now that only one series of observations is being considered.

Since all symbols for quantities other than the parameters $\alpha$, $\beta$, $\mu$ carry the suffix $i$, it will be convenient to omit the suffix, noting that all summations are over $i = 1, 2, \ldots, k$. Estimates of $\alpha$ and $\beta$ will be denoted by $\hat\alpha$ and $\hat\beta$, and values of $P$ and $Q$ obtained from (1) when $\alpha$ and $\beta$ are replaced by $\hat\alpha$ and $\hat\beta$ will be denoted by $\hat P$ and $\hat Q$.


2. FITTING THE LOGISTIC LAW BY MAXIMUM LIKELIHOOD 


Since $\Sigma r$ and $\Sigma rx$ are sufficient statistics and simple to calculate, it seems reasonable to use them. The maximum-likelihood equations for $\hat\alpha$ and $\hat\beta$ are (as Berkson has pointed out)

$$\Sigma r = \Sigma n\hat P, \qquad \Sigma rx = \Sigma nx\hat P. \qquad (2)$$

They can be solved iteratively as follows (cf. Cornfield, 1954). Given trial values of $\hat\alpha$ and $\hat\beta$, calculate the right-hand sides of (2) and also the estimated weights $\hat W$ defined by

$$\hat W = n\hat P\hat Q. \qquad (3)$$

Corrections $\delta\alpha$ and $\delta\beta$ to the trial values are then found by solving the equations

$$
\begin{aligned}
(\Sigma\hat W)\,\delta\alpha + (\Sigma\hat W x)\,\delta\beta &= \Sigma r - \Sigma n\hat P, \\
(\Sigma\hat W x)\,\delta\alpha + (\Sigma\hat W x^2)\,\delta\beta &= \Sigma rx - \Sigma nx\hat P.
\end{aligned}
\qquad (4)
$$

In further iterations it is unnecessary to recalculate the coefficients on the left-hand sides of (4), but only the right-hand sides. This procedure is to be compared with Finney's (e.g. 1952), which is exactly modelled on the ingenious procedure due to Fisher and Bliss for fitting the integrated-normal law, for which no such sufficient statistics are available.
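The iteration defined by (2) to (4) can be sketched in a few lines. This is an illustrative sketch, not the author's computing scheme: the function name and the data are hypothetical, and for robustness the weights are recomputed at every cycle, whereas the text notes that the left-hand coefficients of (4) need not be recalculated.

```python
import math


def fit_logistic_ml(x, n, r, alpha=0.0, beta=1.0, tol=1e-12, max_iter=50):
    """Solve the maximum-likelihood equations (2) by repeated use of (4)."""
    sum_r = sum(r)
    sum_rx = sum(ri * xi for ri, xi in zip(r, x))
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(alpha + beta * xi))) for xi in x]
        w = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]   # weights (3)
        # Right-hand sides of (4): discrepancies in the equations (2).
        g0 = sum_r - sum(ni * pi for ni, pi in zip(n, p))
        g1 = sum_rx - sum(ni * xi * pi for ni, xi, pi in zip(n, x, p))
        # Left-hand coefficients of (4).
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = sw * swxx - swx * swx
        d_alpha = (g0 * swxx - g1 * swx) / det
        d_beta = (sw * g1 - swx * g0) / det
        alpha += d_alpha
        beta += d_beta
        if abs(d_alpha) + abs(d_beta) < tol:
            break
    return alpha, beta


# Hypothetical data: five equally spaced doses, 20 subjects at each.
doses = [-2.0, -1.0, 0.0, 1.0, 2.0]
group_sizes = [20, 20, 20, 20, 20]
responders = [2, 5, 10, 15, 18]
print(fit_logistic_ml(doses, group_sizes, responders))
```

For these made-up data the observed proportions 0.1, 0.25, 0.5, 0.75, 0.9 lie exactly on a logistic curve, so the iteration converges to $\hat\alpha = 0$ and $\hat\beta = \ln 3$.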
Under the conditions that all the $n$'s are equal, and that the $x$'s are equally and not too widely spaced, and cover such a wide range that $PQ$ may be assumed effectively to vanish outside it, the sums on the right-hand sides of (2) may be replaced by integrals, with the aid of the Euler-Maclaurin formula. After integration by parts we obtain

$$
\begin{aligned}
\hat\mu &= x_k + \tfrac12 h - h\,\Sigma r/n, \\
\hat\mu^2 + \frac{\pi^2}{3\hat\beta^2} &= (x_k + \tfrac12 h)^2 - \tfrac{1}{12}h^2 - 2h\,\Sigma rx/n,
\end{aligned}
\qquad (5)
$$

where $x_k$ is the greatest dose and $h$ is the spacing between adjacent doses. The first of these equations is the Spearman-Kärber formula for estimating $\mu$, the full efficiency of which was noted by Cornfield & Mantel (1950). The left-hand side of the second equation is approximately $\hat\mu^2 + 3.290/\hat\beta^2$.

* Present address: Department of Mathematics, Princeton University.
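The Spearman-Kärber estimate in the first of these equations needs no iteration. A minimal sketch follows, assuming equally spaced doses in increasing order and a common group size; the function name and the data are hypothetical.

```python
def spearman_karber(x, n, r):
    """Estimate mu as x_k + h/2 - h * sum(r) / n, where x_k is the
    greatest dose, h the dose spacing and n the common group size."""
    h = x[1] - x[0]
    return x[-1] + 0.5 * h - h * sum(r) / n


# Hypothetical data: five doses one unit apart, 20 subjects at each.
print(spearman_karber([-2.0, -1.0, 0.0, 1.0, 2.0], 20, [2, 5, 10, 15, 18]))
```

For these symmetric made-up data the estimate is 0, the dose at which half the subjects respond. The formula presupposes doses wide enough for $PQ$ to vanish at the extremes, which these illustrative data only roughly satisfy.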


3. 'MINIMUM LOGIT $\chi^2$' (Berkson, 1953)
For graphical treatment it is natural 


$$\Sigma W(l - \hat\alpha - \hat\beta x)^2, \qquad (7)$$

where the $W$'s are empirical weights defined by

$$W = r(n-r)/n. \qquad (8)$$

Now as an estimate of $\alpha + \beta x$ the $l$ defined by (6) is biased, but the bias can be very nearly removed by redefining $l$ thus:

$$l = \ln\frac{r + \tfrac12}{n - r + \tfrac12}. \qquad (9)$$

Moments of the asymptotic distribution of this $l$ may be found by expanding the right-hand side of (9) as a Taylor series about the value $nP$ for $r$, raising it to a power, and then taking expectations term by term. We find in this way
= 10) 
&()) = G4 Be+O(n-2), varja, y,ü). XP UR ( 
n. A(PQ) 
The last of these resu i 


1 
+6(l-a—fr) = ebore 3+7 (nPQ)'In 5+5 (nPQy!In 7 4- z —1n (2nPQ), 
where the Upper sign on the left-hand side refers to P 5.0, t 


var (l) = e-»PQ [eros 3)2+ 


ile 
ni & = $ or less, the mean Square picante sd. veg AC oe ee s ee rote 
i à ess than he r 
are two or more doses for which nP Q <1, the observations eel a nre aia goodift "M 
given positive weights in the least-squares estimation of a and B (6) 
It seems to me advantageous to try to eliminate the bias inl by using the definition (9) rather than ys 
if the method of least Squares is to be used since that leads us to calculate (weighted) sums 0 d ihe 
Buppose there are only two doses, and the corresponding values of nPQ etui 2 Then prov? 


———— M—€ 


| 


Miscellanea 463 


weights W that are assigned are positive, their values are immaterial. The fitted regression line passes
through the points $(x_1, l_1)$ and $(x_2, l_2)$, and $\hat\alpha$ and $\hat\beta$ are unbiased. If now there are three such doses, roughly
equally spaced, the fitted regression line will depend on the weights, but not much; it will make little
difference whether empirical weights (8) are used, or fitted weights (3).
The precise definition of the weight function (so long as it approximates to nPQ) only begins to matter
if (a) there are several doses for which either P or Q is very small, or (b) there are a considerable number
of closely spaced doses for which neither P nor Q is very small. If (a), the bias in l when nPQ is small
will matter if $W \ne 0$; if (b), a purely empirical weight such as (8) will introduce a bias, the extent of which
will depend on how close the doses are. It seems likely that no purely empirical weight function such as
(8) can be satisfactory for all possible sets of doses, though it may well be that the procedure explained
in detail by Berkson (1953) (and not fully explained here) is satisfactory for all ordinary use where the
doses are roughly equally spaced and neither too close nor too far apart.

To guard against the effect of unusual spacing of doses, I have formulated the following recommendation
(or rather, two alternative recommendations), which should yield estimates of $\alpha$ and $\beta$
that are almost unbiased and of minimum variance.


Table 1. Moments of the distribution of l defined by (9)


nPQ    Magnitude of bias    Variance    Mean-square error
2      0.026                0.473       0.474
42     0.078                0.515       0.521
1      0.169                0.505       0.534
4      0.484                0.385       0.619
$      0.050                0.240       1.144


SUGGESTED MODIFICATION OF BERKSON'S METHOD. Defining l by (9), minimize the sum of squares (7),
where either W is defined by (8) or W is replaced by a fitted weight W equal to nPQ if that exceeds 1 and
equal to 0 otherwise.
The empirical weight (8) would be used in cases where it is judged that the weights do not matter much
anyway, as just explained. Otherwise, the fitted weight would be used; that involves the possibility of
further iterations, but the process will converge quicker than the method of maximum likelihood because
the iteration only concerns the weights, not the ordinates. The suggested critical value of 1 for nPQ
is of course rough-and-ready; any value between 2 and 3 might be satisfactory. Unless there are at
least two doses for which positive weights may confidently be assigned, the method must be considered
to have broken down, and no estimates should be derived.
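
The fitted-weight variant of the recommendation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: definition (9) is not reproduced in this part of the text, so the sketch assumes the usual empirical logit $l = \ln\{(r+\tfrac12)/(n-r+\tfrac12)\}$; the starting weights are taken equal (the empirical weight (8) is not used); and the names `empirical_logit`, `weighted_line` and `fit_logistic` are hypothetical.

```python
import math

def empirical_logit(r, n):
    # ASSUMED form of definition (9): l = ln((r + 1/2) / (n - r + 1/2)).
    return math.log((r + 0.5) / (n - r + 0.5))

def weighted_line(x, y, w):
    # Weighted least-squares straight line y = a + b*x.
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def fit_logistic(x, r, n, critical=1.0, tol=1e-10, max_iter=100):
    # The ordinates l are computed once; only the weights are iterated.
    l = [empirical_logit(ri, ni) for ri, ni in zip(r, n)]
    a, b = weighted_line(x, l, [1.0] * len(x))  # equal starting weights (assumption)
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        npq = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]
        # Fitted weight: nPQ if it exceeds the critical value, otherwise 0.
        w = [v if v > critical else 0.0 for v in npq]
        if sum(1 for v in w if v > 0.0) < 2:
            raise ValueError("fewer than two doses with positive weight")
        a_new, b_new = weighted_line(x, l, w)
        done = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if done:
            break
    return a, b

# Illustrative data (not from the paper): five doses, 50 subjects at each.
alpha_hat, beta_hat = fit_logistic([-2, -1, 0, 1, 2], [8, 17, 29, 39, 45], [50] * 5)
print(alpha_hat, beta_hat)
```

The error raised when fewer than two doses keep a positive weight mirrors the remark that the method must then be considered to have broken down.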
Whether the above procedure is preferable to the method of maximum likelihood is a question that 


Cannot be answered without refere Granted the assumptions of independent 
inomial observations and a logistic response law, the fullest possible summary of the information 
Contained in the observations is provided by the whole likelihood function. For some purposes, it 


Suffices to quote merely the position of the m 


erivatives there. Berkson's procedure may be valuable as as j i y 
Since, however, the assumptions cannot be asserted absolutely, a doubt concerning their appropriateness
must always lurk; and graphical presentation of the data by plotting l against x has value as a check,
apart from any approximation to the maximum-likelihood estimates.


4. TESTING GOODNESS OF FIT
When the question of goodness of fit of an assumed response law arises, it has been the practice to
calculate a goodness-of-fit $\chi^2$ (e.g. Armitage & Allen, 1950). This is rather unsatisfactory, since $\chi^2$ will
be sensitive to departures from the binomial distribution of responses at any dose, and often in practice
one cannot be confident that the binomial distribution is very close to the truth. In any case, $\chi^2$ does
not indicate the direction of departure.
A better procedure, for testing the adequacy of the logistic law, would be to estimate parameters $\beta_2$
and $\beta_3$, assumed small, where

$$\ln(P/Q) = \beta_1(x-\mu) + \beta_2(x-\mu)^2 + \beta_3(x-\mu)^3. \qquad (13)$$




* i i isti > Erz? and Erz? 
The aj priate statistics, besides Xr and Erv already used in fitting the logistic law, are X 
e appro > 


simi i ing interesting 
ith Wii i h chance of detecting anything inte d 
ec ssions with WI instead of r). There is not much c | I aside] 
a Eeo ae number of series of observations (in the vues pe — — 
Ifo i i iti ioned above for the Spearman- "m 
s the doses satisfy the conditions mentione ; ie uate P wl 
i Sta cite anode the condition of covering a wide range. It is then easy to estimate £. 
ore i ii R 1 ight 
3 ations similar to (5). . K— 
or o, tus reu curve of P against z is not antisymmetrical about vs pu. Th rece TES 
b sells a by a transformation of the z-scale. If f, = 0 and fj; +0, the logistic law NT eatimatol of f. 
pee ossibly one of the other suggested laws might be (see Finney, 1952, chap. I). ii e «c particular 
nd f are available from a number of assays, their signs, rather than their values, wi 
a 3 
interest. 


5. FITTING THE ANGULAR LAW 


The only other law besides the logistic which seems at present to be worth considering is 


P=sin?(a+fx) if PRISON] (14) 
=0 otherwise. 


Here the weight is constant (and positive) for $\alpha + \beta x$ in $(0, \tfrac12\pi)$ and zero elsewhere. For this law, as for
the logistic, the transform of r by the inverse response function can be modified so that it is asymptotically
unbiased, whatever the value of $\alpha + \beta x$ in the range $(0, \tfrac12\pi)$. The modified transformation, corresponding
to (9) above, is (Anscombe, 1954)


scant 54 (15) 
y — sin J 
Moments, corresponding to (10), are 
1 P-Q (16) 
Ely) = a+ fx -- O(n-3), var(y)~ =, han spay" 


The device of adding equal constants to r and n − r so that the transformed quantity is approximately
unbiased, for all x in an appropriate range, is possible only for response laws of a certain mathematical
type, such that x can be expressed as an integral with respect to P of a negative power of P(1 − P). The
device is not therefore available for the integrated-normal response law.

I have taken advantage of helpful comments by Dr P. Armitage, Dr J. Berkson and Dr J. Cornfield
in revising the draft of this paper.

Added in proof. In §3 above it is pointed out that (i) there seems to be no particular reason why
the transformed quantity l should be defined by (6) rather than by some less obvious expression
such as (9), (ii) the properties of l can be, and need to be, investigated, and (iii) if n is large the
definition (9) is superior to (6), and seems satisfactory. Nothing has been said about the minimum
logit $\chi^2$ method when n is small; the results given might be expected to apply if n = 50, but hardly
if n = 5. Even if n is large, (9) is not the best possible definition; Professor J. W. Tukey has pointed
out that (9) can be improved by adding ½ to the right-hand side of (9) when r = n, and subtracting
½ when r = 0, so that the range of values assumed by l is widened a little. No doubt by numerical
study it would be possible to improve substantially on (9) when n is small.


REFERENCES
ANSCOMBE, F. J. (1954). Comments on a paper by R. A. Fisher. Biometrics, 10, 141-4.
ARMITAGE, P. & ALLEN, I. (1950). Methods of estimating the LD 50 in quantal response data. J. Hyg.,
Camb., 48, 298-322.
BERKSON, J. (1953).
with quantal resp
CORNFIELD, J. (1954),
and Mathematics in Biol
CORNFIELD, J. & MANTEL, N. (1950).
the calculation of the do




Existence and uniqueness of a uniformly most powerful randomized 
unbiased test for the binomial 


By A. A. BLANK 
Department of Mathematics, University of Tennessee 


INTRODUCTION 


Tocher (1950) applied the Neyman-Pearson theory of testing to discrete variables. In an example he
pointed out the fact that among unbiased tests for the binomial those which are most powerful possess
a certain special form. The questions of uniqueness and existence were left open.

Let $\theta$ denote the probability of success in a binomial trial. The problem is to test the hypothesis
$\theta = p \in (0, 1)$ against the alternative $\theta \ne p$ by performing a sequence of n binomial trials. Let X be a
random variable denoting the number of successes in the n trials. A test is equivalent to the choice of an
acceptance criterion $0 \le \gamma(X) \le 1$ defined so that the hypothesis $\theta = p$ is accepted with probability
$\gamma_\nu = \gamma(\nu)$ whenever $\nu$ is the number of successes ($\nu = 0, 1, 2, \ldots, n$). The probability of accepting the
hypothesis in the event of its truth is then

$$\sum_{\nu=0}^{n} \gamma_\nu b_\nu(p) = 1-\alpha, \qquad (1)$$

where $b_\nu(p)$ is the binomial term $\binom{n}{\nu} p^\nu (1-p)^{n-\nu}$ and $\alpha$ is the size of the test. Such a test is said to be
unbiased if the probability of accepting the hypothesis when false is less than the probability of acceptance
when true, that is,

$$\sum_{\nu=0}^{n} \gamma_\nu b_\nu(\theta) \le 1-\alpha \quad (0 \le \theta \le 1). \qquad (2)$$

A test is said to be uniformly most powerful if simultaneously for all alternatives $\theta \ne p$ the probability
of falsely accepting the hypothesis is minimized. Tocher showed that if a uniformly most powerful
test exists among unbiased size $\alpha$ tests it must have the form

$$\gamma_\nu = 1 \ \text{ for } s+1 \le \nu \le t-1, \qquad \gamma_s = c, \quad \gamma_t = d, \qquad \gamma_\nu = 0 \ \text{ otherwise.} \qquad (3)$$

It will be shown that among tests of this form there exists a unique unbiased test. Since Tocher's
analysis may be applied directly to show this is most powerful among unbiased tests the proof will then
be complete.
PROOF OF EXISTENCE AND UNIQUENESS 
Let us assume that a test (3) is given. We seek a value $\theta = p$ which minimizes the size $\alpha$ and maximizes
the acceptance function

$$P(\theta) = \sum_{\nu=0}^{n} \gamma_\nu b_\nu(\theta) = c\,b_s(\theta) + d\,b_t(\theta) + \sum_{\nu=s+1}^{t-1} b_\nu(\theta)$$

in the interval $0 \le \theta \le 1$, where we assume

$$0 < c,\ d \le 1, \qquad 0 \le s < t \le n.$$

The effect of the first assumption is to guarantee that the terms of index s and t are present. The second
is not restrictive since the proof is direct when s = t.

We have

$$P'(\theta) = \frac{1}{\theta(1-\theta)}\{S\,b_s(\theta) - T\,b_t(\theta)\},$$

where

$$S = cs(1-\theta) + (1-c)(n-s)\theta,$$
$$T = t(1-\theta) - d(t-n\theta) = t(1-d) + \theta(nd-t).$$

$P'(\theta)$ can vanish only if (i) $\theta = 0$, (ii) $\theta = 1$, or (iii) $\theta$ lies in the open interval $I = (s/n, t/n)$.
If both s = 0 and t = n, then, if c = d = 1, $P(\theta) \equiv 1$ and the case is trivial. Otherwise p must
satisfy $\{p/(1-p)\}^{n-1} = (1-c)/(1-d)$, which has a unique solution since $\theta/(1-\theta)$ is monotone and unbounded.
A unique extremum exists.

If not both s = 0, t = n, it may be seen that S is positive for $\theta \in I$. From the second representation of
T above it is easily seen that T, also, is positive. It follows that p satisfies the equation

$$\frac{T\,b_t(\theta)}{S\,b_s(\theta)} = 1.$$

The left-hand side of this equation is monotone increasing in $\theta$ since


á thi Ts z(e i=- 
sasia | 85 d0V ^75) 7 Ga —0) "E. 
` <trer 
The existence and uniqueness of an extremum of P(0) in I has been proved. Further, at the extr 


Pp) = ASZ eo 
Pp LS T pq H 


(S mw X * 3 
since S| = 


0. 


= ---<~—, 
P q pq 
where $q = 1 - p$, we conclude that the extremum is a maximum.
In the open interval (0, 1) there is at most one local maximum. If a maximum should occur at $\theta = 0$
there could be no other local maximum, otherwise the open interval between the maxima would have
to contain a minimum. The same result would hold should a maximum occur at $\theta = 1$.
In all cases there is a unique maximum of $P(\theta)$.


All that remains is to prove uniqueness of the test subject to the conditions:

(a) $P(p) = 1-\alpha$,
(b) $P'(p) = 0$,
(c) $P''(p) < 0$.

Omitting, for the moment, the condition (a) on the size of the test, we find from (b) and (c) that p is an
increasing function of $s^* = s+1-c$ and of $t^* = t-1+d$, independently, since


£p ep | np—s 


Given p (0 «p — 1), the condition (a) determines ¢* uniquely as an increasing function of s*. Th 
of 0 which maximizes P(0) i 


"glue 
3 i i e the valt i 
Increasing with s* and cannot assum! 


for all p in the open interval. 20, Fo 
The special cases p = 0, p = 1, are treated differently. It is sufficient to consider the case p = 0. For
this case there is no unique size $\alpha$ test. We must have $s^* = \alpha$ and $t^*$ may take any allowable value.
Unbiasedness is assured by taking $\gamma_1 \le 1-\alpha$. However, the uniformly most powerful test among these
is given by $\gamma_0 = 1-\alpha$, $\gamma_\nu = 0$ $(\nu > 0)$. A similar criterion exists for p = 1.
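
For a given n, p (with 0 < p < 1) and size α, conditions (a) and (b) are linear in c and d once s and t are fixed, because $b_\nu'(\theta) = b_\nu(\theta)(\nu - n\theta)/\{\theta(1-\theta)\}$. The following Python sketch exploits this to locate the test of form (3) by direct search. It is an illustration only: the function names are hypothetical, condition (c) is not checked explicitly, and the search simply returns the first pair (s, t) for which the solved c and d fall in the admissible range.

```python
from math import comb

def binom_term(n, v, theta):
    # b_v(theta) = C(n, v) * theta^v * (1 - theta)^(n - v)
    return comb(n, v) * theta ** v * (1 - theta) ** (n - v)

def unbiased_test(n, p, alpha, eps=1e-12):
    """Search for (s, t, c, d) of form (3) satisfying (a) P(p) = 1 - alpha
    and (b) P'(p) = 0, with 0 < c, d <= 1 and 0 <= s < t <= n."""
    b = [binom_term(n, v, p) for v in range(n + 1)]
    for s in range(n):
        for t in range(s + 1, n + 1):
            inner = range(s + 1, t)
            m0 = sum(b[v] for v in inner)
            m1 = sum(b[v] * (v - n * p) for v in inner)
            # Linear system in (c, d):
            #   c*b_s             + d*b_t             = 1 - alpha - m0   (a)
            #   c*b_s*(s - n*p)   + d*b_t*(t - n*p)   = -m1              (b)
            rhs_a = 1 - alpha - m0
            rhs_b = -m1
            det = b[s] * b[t] * (t - s)
            c = (rhs_a * b[t] * (t - n * p) - b[t] * rhs_b) / det
            d = (b[s] * rhs_b - b[s] * (s - n * p) * rhs_a) / det
            if eps < c <= 1 + eps and eps < d <= 1 + eps:
                return s, t, c, d
    return None

# Illustrative case: n = 10 trials, hypothesis p = 1/2, size 0.05.
print(unbiased_test(10, 0.5, 0.05))
```

In the symmetric case p = ½ the search returns s = n − t and c = d, as one would expect.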


CONFIDENCE ESTIMATES
Itis well known that the Neyman-Pearson method of 
The test described above leads t. 


n 
imatio? 

testing may also be used as a means of est?" js 

only necessary to tabulate s* 


E 
o a confidence interval estimate of P at the confidence level 1 6 observe 
obtain the desired confidence estimate W a 
po 


an 
mial trials and a value x of a random variable which has & To mits 
distribution in the interval 0<x<1. We obtain the respective lower and upper confidence 
and p, by locating in the table tho values satisfying 
(po) = v—z, 8*(p,) = va. 
REFERENCE
TOCHER, K. D. (1950). Extension of the Neyman-Pearson theory of tests to discontinuous variates.
Biometrika, 37, 130.




A note on the circular multivariate distribution 


By G. S. WATSON 
The Australian National University, Canberra, A.C.T. 


1. Summary. In Anderson's (1941) paper on the distribution of serial correlation coefficients, Hotelling's
suggestion of a circular definition of these coefficients appears. Since then this device has been
used constantly to lighten the mathematical difficulties in serial correlation theory. There has, however,
been no suggestion that the model would ever be a good approximation in practice; the reverse has
always been suggested in fact. It is the purpose of the present note to point out a fairly common class of
data in which the circular model will be a good approximation. A typical case of this occurs when one
wishes to study seasonal variation and begins by averaging the data over many years.


2. Periodically averaged data. Let $\{x_i\}$ be generated by a stationary process with, for simplicity, zero
mean and unit variance and a correlation function $\rho_s = E(x_i x_j)$, where $s = |i-j|$.

Suppose Nn consecutive x's are observed and define

$$y_j = x_j + x_{n+j} + \ldots + x_{(N-1)n+j} \quad (j = 1, \ldots, n).$$

Then, with $|i-j| = s$,

$$\mathrm{var}(y_j) = N + 2(N-1)\rho_n + 2(N-2)\rho_{2n} + \ldots + 2\rho_{(N-1)n},$$

$$\mathrm{covar}(y_i, y_j) = N\rho_s + (N-1)(\rho_{n-s}+\rho_{n+s}) + (N-2)(\rho_{2n-s}+\rho_{2n+s}) + \ldots + (\rho_{(N-1)n-s}+\rho_{(N-1)n+s}),$$

and

$$\mathrm{corr}(y_i, y_j) = \frac{\rho_s + \left(1-\frac{1}{N}\right)(\rho_{n-s}+\rho_{n+s}) + \ldots + \frac{1}{N}(\rho_{(N-1)n-s}+\rho_{(N-1)n+s})}{1 + 2\left(1-\frac{1}{N}\right)\rho_n + 2\left(1-\frac{2}{N}\right)\rho_{2n} + \ldots + \frac{2}{N}\rho_{(N-1)n}}.$$

When $N \to \infty$,

$$\mathrm{corr}(y_i, y_j) \sim \frac{\rho_s + \rho_{n-s} + \rho_{n+s} + \ldots + \rho_{(N-1)n-s} + \rho_{(N-1)n+s}}{1 + 2\rho_n + 2\rho_{2n} + \ldots + 2\rho_{(N-1)n}}.$$

If in this expression s is replaced by n − s, the expression is only altered by the addition of
$\rho_{Nn-s} - \rho_{(N-1)n+s}$ to the numerator. This additional term will usually be negligible in practice.

Thus, if N is sufficiently large, the variables $y_j$ (j = 1, ..., n) will have zero means, constant variances,
and the correlation of $y_i$ and $y_j$ will be the same for $|i-j| = s$ (s = 1, ..., n − 1) as for $|i-j| = n-s$.

This is the most general case of the circular multivariate distribution.

Further, if the basic process is such that $\rho_r$ may be neglected for $r \ge n$, then

$$\mathrm{corr}(y_i, y_j) = \rho_s + \rho_{n-s}, \qquad |i-j| = s.$$

If, for example, the basic process is a first-order autoregressive process, with autocorrelation function
$\rho_s = \rho^s$, we find that

$$\mathrm{corr}(y_i, y_j) = \rho^s + \rho^{n-s} \approx \frac{\rho^s + \rho^{n-s}}{1+\rho^n},$$

since we are assuming that $\rho^r$ ($r \ge n$) is negligible. This last correlation function is that of the first-order
circular autoregressive process considered by, among others, Anderson & Anderson (1950).
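
The circular property is easy to check numerically. The sketch below (Python; the function name and the particular values ρ = 0.8, n = 12, N = 2000 are illustrative choices, not taken from the note) evaluates the finite-N correlation for a first-order autoregressive process, with the weights (1 − k/N) obtained by dividing the covariance and variance expressions by N, and compares it with $(\rho^s+\rho^{n-s})/(1+\rho^n)$.

```python
def averaged_corr(rho, n, N, s):
    """corr(y_i, y_j) for |i - j| = s, from the finite-N covariance and
    variance expressions, each divided by N."""
    num = rho(s)
    den = 1.0
    for k in range(1, N):
        w = 1.0 - k / N
        num += w * (rho(k * n - s) + rho(k * n + s))
        den += 2.0 * w * rho(k * n)
    return num / den

r, n, N = 0.8, 12, 2000           # illustrative values

def ar1(lag):
    # Autocorrelation function of a first-order autoregressive process.
    return r ** abs(lag)

for s in range(1, n):
    exact = averaged_corr(ar1, n, N, s)
    circular = (r ** s + r ** (n - s)) / (1 + r ** n)
    print(s, round(exact, 5), round(circular, 5))
```

For these values the two columns agree closely and the table is symmetric under s → n − s, which is the circular structure described above.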
REFERENCES

ANDERSON, R. L. (1941). Distribution of the serial correlation coefficient. Ann. Math. Statist. 13, 1.
ANDERSON, R. L. & ANDERSON, T. W. (1950). Distribution of the serial correlation coefficient for
residuals from a fitted Fourier series. Ann. Math. Statist. 21, 59.



The fitting of regression curves with autocorrelated data 


By N. A. HUTTLY 
Marconi's Wireless Telegraph Company, Chelmsford 


1. INTRODUCTION 


radar 

This investigation was prompted by a study of problems dealing with TRAEN i pues 

signals from a moving target, one such measurement being the difference in phas a ios tB 

received at two different aerials from the same target. In this part icular problem the ou s nati 

radar was a continuous wave form, so that the data under investigation was also of a peciam 

From physical considerations of this problem (e.g. the assumption that the target. had x wie 

it was expected that the phase difference would vary linearly with time. Thus the statistical i aia. di 

was concerned with the determination of this linear trend, that is, the problem gren ^e ee is 
of fitting a regression line to continuously varying data. In order to deal with data of this kind it is
necessary to have recourse to the theory of autocorrelation, for all continuous data must be autocorrelated.

This paper is a short investigation into the general principles involved in regression fitting when the
data involved is autocorrelated.


2. GENERAL DISCUSSION OF THE PROBLEM 


In the classical theory of regression for random normal variables the method of least squares and the
method of maximum likelihood are equivalent, but for stochastically dependent variables this is no
longer, in general, true. In the case of large samples it has been shown that the two methods are asymptotically
equivalent (Grenander, 1954; Grenander & Rosenblatt, 1954; Wold, 1950) for stationary stochastic
processes. We shall, in this pay al’ samples where we ie o total 
T of independent points is small* although th 


tin 
on- 
ices 


ed form of stochastic dependence, namely, me 
priori. This case very often occurs in pu 
d by virtue of the Wiener-Khintchine relation ses 
f the receiver. Wehave confined ourselves to the ess 
SS j ds could easily be extended to higher-order reg" 
curves. 


In §§3 and 4 we develop the general results appertaining to the least-squares and maximum-likelihood
approaches, and in §5 we utilize these results for a special form of stochastic dependence, the linear
Markov process.

3. LEAST-SQUARES SOLUTION
We may formulate as our model

$$y(t) = \alpha + \beta t + \gamma t^2 + \ldots + \epsilon(t),$$

where y(t) is the observed variable; $\alpha, \beta, \gamma, \ldots$ the constants of the regression curve, which are to be estimated
from the data; $\epsilon(t)$ is a stationary random variable.

Without loss of generality we take $\epsilon(t)$ to have zero mean, variance $\sigma^2$ and autocorrelation function
$\rho(\tau)$. We also take, for convenience, the period of observation to be

$$-T \le t \le +T.$$


3.1. Linear regression

Let us consider

$$y(t) = \alpha + \beta t + \epsilon(t). \qquad (1)$$

Since we sample in the range $(-T \le t \le +T)$ we have

$$\int_{-T}^{T} t^r\,dt = 0 \quad (r \text{ odd}). \qquad (2)$$

* The sample may be discrete or continuous.
† By equivalent number of points we mean that number of independent points which gives the
same amount of information as the sample considered.



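In accordance with the general theory of least squares we form

$$S = \int_{-T}^{T}\{y(t) - \alpha - \beta t\}^2\,dt,$$

and solve the equations

$$\frac{\partial S}{\partial\alpha} = \frac{\partial S}{\partial\beta} = 0.$$

This procedure leads us to the two equations

$$\hat\alpha = \Big(\int_{-T}^{T} y(t)\,dt\Big)\Big/(2T), \qquad (3)$$

$$\hat\beta = 3\Big(\int_{-T}^{T} t\,y(t)\,dt\Big)\Big/(2T^3) \qquad (4)$$
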

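(by virtue of equation (2)). Since $\hat\alpha$, $\hat\beta$ are functions of y(t) and therefore of $\epsilon(t)$ they are random variables;
therefore we can form their means and variances. From (1) we have

$$E\{y(t)\} = \alpha + \beta t.$$

Therefore $E(\hat\alpha) = \alpha$, $E(\hat\beta) = \beta$, and

$$\mathrm{var}(\hat\alpha) = E\Big\{\Big(\int_{-T}^{T}\epsilon(t)\,dt\Big)^2\Big\}\Big/(4T^2), \qquad (5)$$

$$\mathrm{var}(\hat\beta) = 9E\Big\{\Big(\int_{-T}^{T} t\,\epsilon(t)\,dt\Big)^2\Big\}\Big/(4T^6). \qquad (6)$$

From (5)

$$4T^2\,\mathrm{var}(\hat\alpha) = E\Big\{\int_{-T}^{T}\int_{-T}^{T}\epsilon(t)\,\epsilon(l)\,dt\,dl\Big\},$$

and on putting $\tau = l - t$, since $E\{\epsilon(t)\,\epsilon(t+\tau)\} = \sigma^2\rho(\tau)$, we have

$$4T^2\,\mathrm{var}(\hat\alpha) = \sigma^2\int_{-T}^{T}dt\int_{-T-t}^{T-t}\rho(\tau)\,d\tau,$$

and we reverse the order of integration so that

$$\int_{-T}^{T}dt\int_{-T-t}^{T-t}d\tau = 2\int_{0}^{2T}d\tau\int_{-T}^{T-\tau}dt,$$

and finally,

$$\mathrm{var}(\hat\alpha) = \sigma^2\Big(\int_{0}^{2T}(2T-\tau)\,\rho(\tau)\,d\tau\Big)\Big/(2T^2) \qquad (7)$$

and

$$\mathrm{var}(\hat\beta) = 3\sigma^2\Big(\int_{0}^{2T}(4T^3 - 6T^2\tau + \tau^3)\,\rho(\tau)\,d\tau\Big)\Big/(4T^6). \qquad (8)$$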
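
Expressions (7) and (8) hold for any autocorrelation function and can be evaluated by simple quadrature. The Python sketch below is an illustration only (the function names and the use of Simpson's rule are choices made here, not part of the paper); it evaluates (7) and (8) for the Markov case $\rho(\tau) = e^{-|\tau|}$ treated in §5 and compares the result with the closed forms (26) and (27) given there.

```python
import math

def simpson(f, a, b, m=2000):
    # Composite Simpson rule with an even number m of sub-intervals.
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def ls_var_alpha(rho, T, sigma2=1.0):
    # Equation (7).
    integral = simpson(lambda tau: (2 * T - tau) * rho(tau), 0.0, 2 * T)
    return sigma2 * integral / (2 * T ** 2)

def ls_var_beta(rho, T, sigma2=1.0):
    # Equation (8).
    integral = simpson(
        lambda tau: (4 * T ** 3 - 6 * T ** 2 * tau + tau ** 3) * rho(tau),
        0.0, 2 * T)
    return 3 * sigma2 * integral / (4 * T ** 6)

def markov(tau):
    return math.exp(-abs(tau))

T = 1.5
closed_alpha = (2 * T - 1 + math.exp(-2 * T)) / (2 * T ** 2)                 # (26)
closed_beta = 3 * (2 * T ** 3 - 3 * T ** 2 + 3
                   - 3 * math.exp(-2 * T) * (1 + T) ** 2) / (2 * T ** 6)     # (27)
print(ls_var_alpha(markov, T), closed_alpha)
print(ls_var_beta(markov, T), closed_beta)
```

The same two functions can be reused with any other autocorrelation function supplied as `rho`.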
0 


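3.2. Quadratic regression
Let us now consider

$$y(t) = \alpha + \beta t + \gamma t^2 + \epsilon(t).$$

From the normal equations we get

$$\hat\alpha = 3\Big(3T^2\int_{-T}^{T} y(t)\,dt - 5\int_{-T}^{T} t^2y(t)\,dt\Big)\Big/(8T^3), \qquad (9)$$

$$\hat\beta = 3\Big(\int_{-T}^{T} t\,y(t)\,dt\Big)\Big/(2T^3), \qquad (10)$$

$$\hat\gamma = 15\Big(3\int_{-T}^{T} t^2y(t)\,dt - T^2\int_{-T}^{T} y(t)\,dt\Big)\Big/(8T^5), \qquad (11)$$

leading to $E(\hat\alpha) = \alpha$, $E(\hat\beta) = \beta$, $E(\hat\gamma) = \gamma$, and

$$\mathrm{var}(\hat\alpha) = 3\sigma^2\Big(\int_{0}^{2T}(48T^5 - 24T^4\tau - 80T^3\tau^2 + 60T^2\tau^3 - 5\tau^5)\,\rho(\tau)\,d\tau\Big)\Big/(64T^6), \qquad (12)$$

$$\mathrm{var}(\hat\beta) = 3\sigma^2\Big(\int_{0}^{2T}(4T^3 - 6T^2\tau + \tau^3)\,\rho(\tau)\,d\tau\Big)\Big/(4T^6), \qquad (13)$$

$$\mathrm{var}(\hat\gamma) = 45\sigma^2\Big(\int_{0}^{2T}(16T^5 - 40T^4\tau + 20T^2\tau^3 - 3\tau^5)\,\rho(\tau)\,d\tau\Big)\Big/(64T^{10}). \qquad (14)$$

The above analysis is true whatever the distribution of $\epsilon(t)$, but in order to effect a comparison with the
maximum-likelihood solution we shall assume that the $\epsilon(t)$'s are normally distributed.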


4. MAXIMUM-LIKELIHOOD SOLUTION
The method that we have used in this section is to determine the maximum-likelihood solution for the
case of discrete variables, then let the distance between the variables become infinitesimal so that in the
limit we have the continuous case.
Therefore we use as our model

$$y(t_r) = \alpha + \beta t_r + \gamma t_r^2 + \ldots + \epsilon(t_r).$$

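4.1. Linear regression
We have the likelihood of obtaining a given sample of size n as

$$L = P(\epsilon_1 \ldots \epsilon_n),$$

and if the $\epsilon_r$ follow a normal distribution, then

$$L = K\exp\Big[-\Big(\sum_{r,s=1}^{n} w_{rs}\,\epsilon_r\epsilon_s\Big)\Big/(2\omega\sigma^2)\Big],$$

where $\omega$ is the determinant of the correlation matrix $[\rho_{rs}]$ of the $\epsilon_r$'s and $w_{rs}$ the cofactor of $\rho_{rs}$ in $\omega$.
To simplify matters we make our sample of size 2n + 1, with r = −n, ..., 0, ..., n.

We get, therefore, by the theory of maximum likelihood, the equations

$$\frac{\partial\log L}{\partial\alpha} = \frac{\partial\log L}{\partial\beta} = 0 \qquad (15)$$

to solve.
Now since

$$\sum_{r=-n}^{n}\sum_{s=-n}^{n} w_{rs}\,t_r^p\,t_s^q = 0 \quad \text{if } p+q \text{ is odd},$$

we get as the solutions of (15) the estimates $\hat\alpha$, $\hat\beta$ given by

$$\hat\alpha = (y)/(1), \qquad (16)$$
$$\hat\beta = (ty)/(tt'), \qquad (17)$$

where

$$(1) = \sum_{r=-n}^{n}\sum_{s=-n}^{n} w_{rs}, \qquad (t^pt'^q) = \sum_{r=-n}^{n}\sum_{s=-n}^{n} w_{rs}\,t_r^p\,t_s^q, \qquad (t^py) = \sum_{r=-n}^{n}\sum_{s=-n}^{n} w_{rs}\,t_r^p\,y_s.$$

N.B. Since $w_{rs} = w_{sr}$ (from the symmetry of the variance-covariance matrix) the above expressions
are symmetrical also, i.e.

$$(t^pt'^q) = (t^qt'^p).$$

Hence

$$E(\hat\alpha) = \alpha, \qquad E(\hat\beta) = \beta,$$
$$\mathrm{var}(\hat\alpha) = E\{(\epsilon)^2\}/(1)^2,$$
$$\mathrm{var}(\hat\beta) = E\{(t\epsilon)^2\}/(tt')^2.$$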


í 


= 


Miscellanea 471 


n n 2 
But we have é((te)*) = e| > X wnt 


r——ns-—n 
= n n n n 
=F à X X X a«6QeosHu6]; 
r——nS——nu-—-—nv-—--—n 
and since 6(6,6,) = Pin 


n n n n 
é((t))2o0? 3j £X X X wot p,. 
T——nSs—-—nWu--—nv--—n 


Further, by the theory of determinants, 


Zante {2 ee 
v w (s—u). 
Therefore é((tc)?) = wo? X Š Orstpt, = qo" (tt^). 
r=-ns=-n 
Hence we finally get varg = m, (18) 
varf = o*o[(t'). a9) 
4:2. Quadratic regression 
We now consider y(t) = a +t + yÈ + e(t). 
The maximum-likelihood equations give us 
| & = ((0*) (y) — (*) (yE?) (1) 7 (0), (20) 
B = (Mt), (21) 
y = (2) (@y) — (0) (AEE?) (1) — (t°)? (22) 
, o leading to é(3)-« EB, éy-v 
| and varg = wo*(t*t’*)/{(t*t’*) (1) — ()*}, (23) 
var f = wo*/(tt’), (24) 
vary = wa*(1)/{(t*t’*) (1) — (¢*)?}. (25) 
5. RESULTS


In §§3 and 4, we have derived the formal expressions for the least-squares and maximum-likelihood
solutions to the problem of fitting regression curves to autocorrelated data.

By letting the summation expressions in equations (18), (19), (23), (24) and (25) tend to their integral
limits we get a formal comparison between the two methods for continuous variables. In this section
we shall examine the relative efficiencies of the two methods for the case of Markov dependence. This
gives us

$$\rho(\tau) = e^{-m|\tau|},$$

and since m is only a scaling factor we can incorporate it into the sampling range 2T so that

$$\rho(\tau) = e^{-|\tau|}.$$

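5.1. Linear regression


Substituting into (7) and (8) we get for the least-squares solution

$$\mathrm{var}(\hat\alpha) = \sigma^2(2T - 1 + e^{-2T})/(2T^2), \qquad (26)$$
$$\mathrm{var}(\hat\beta) = 3\sigma^2\{2T^3 - 3T^2 + 3 - 3e^{-2T}(1+T)^2\}/(2T^6). \qquad (27)$$

For the maximum-likelihood solution we have

$$\omega = \begin{vmatrix} 1 & p & p^2 & \ldots & p^{2n} \\ p & 1 & p & \ldots & p^{2n-1} \\ \vdots & & & & \vdots \\ p^{2n} & p^{2n-1} & \ldots & p & 1 \end{vmatrix} = (1-p^2)^{2n},$$

and

$$w_{nn} = w_{-n,-n} = (1-p^2)^{2n-1},$$
$$w_{rr} = (1+p^2)(1-p^2)^{2n-1} \quad (r \ne \pm n),$$
$$w_{r,r\pm1} = -p(1-p^2)^{2n-1},$$
$$w_{rs} = 0 \quad \text{otherwise}.$$

We sample at equal intervals throughout the range 2T, therefore

$$t_r = rT/n$$

and

$$p = e^{-T/n}.$$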


[Fig. 1. Percentage efficiency (vertical scale 70 to 100). Curves: $\hat\alpha$ in linear regression; $\hat\alpha$ in quadratic regression; $\hat\beta$ in linear/quadratic regression; $\hat\gamma$ in quadratic regression.]
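This gives us

$$(1) = \sum_{r=-n}^{n}\sum_{s=-n}^{n} w_{rs} = (1-p^2)^{2n-1}\{2(1-p) + (1-p)^2(2n-1)\}.$$

Therefore

$$\lim_{n\to\infty}(1) = \lim_{n\to\infty}(2T/n)^{2n}(1+T),$$

and similarly for $(tt')$, $(t^2)$, etc. So that upon substitution into (18) and (19) we get

$$\mathrm{var}\,\hat\alpha = \sigma^2/(T+1), \qquad (28)$$
$$\mathrm{var}\,\hat\beta = 3\sigma^2/\{T(T^2+3T+3)\}. \qquad (29)$$

Hence by taking the ratios of (26)-(28) and (27)-(29) we obtain the efficiency of the least-squares
estimators of the regression line as compared to the maximum-likelihood estimators. Fig. 1 shows
this measure as a percentage.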
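
The passage to the limit can be checked numerically for a finite grid. Since $w_{rs}/\omega$ is the (r, s) element of the inverse correlation matrix, (18) and (19) are $\sigma^2$ divided by the quadratic forms $\mathbf{1}'R^{-1}\mathbf{1}$ and $\mathbf{t}'R^{-1}\mathbf{t}$, and for Markov dependence the inverse is tridiagonal with the cofactor pattern listed above. The Python sketch below is illustrative (hypothetical function names, an arbitrary choice T = 2 and n = 5000) and compares the discrete values with (28) and (29).

```python
import math

def gls_inner(f, g, p):
    """f' R^{-1} g for the Markov correlation matrix R = [p^|r-s|],
    using the tridiagonal form of R^{-1}: corner elements 1, interior
    diagonal 1 + p^2, off-diagonal -p, all divided by 1 - p^2."""
    m = len(f)
    total = f[0] * g[0] + f[-1] * g[-1]
    total += (1 + p * p) * sum(f[i] * g[i] for i in range(1, m - 1))
    total -= p * sum(f[i] * g[i + 1] + f[i + 1] * g[i] for i in range(m - 1))
    return total / (1 - p * p)

def ml_variances_discrete(T, n, sigma2=1.0):
    # 2n + 1 equally spaced points t_r = r*T/n, adjacent correlation e^{-T/n}.
    t = [r * T / n for r in range(-n, n + 1)]
    ones = [1.0] * len(t)
    p = math.exp(-T / n)
    var_alpha = sigma2 / gls_inner(ones, ones, p)   # (18)
    var_beta = sigma2 / gls_inner(t, t, p)          # (19)
    return var_alpha, var_beta

T = 2.0
va, vb = ml_variances_discrete(T, 5000)
print(va, 1 / (T + 1))                        # compare with (28)
print(vb, 3 / (T * (T * T + 3 * T + 3)))      # compare with (29)
```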


5.2. Quadratic regression


Substituting into (12), (13) and (14) we get the least-squares solution giving

$$\mathrm{var}(\hat\alpha) = 3\sigma^2\{6T^5 - 3T^4 - 20T^3 + 45T^2 - 75 + 3e^{-2T}(T^2+5T+5)^2\}/(8T^6), \qquad (30)$$
$$\mathrm{var}(\hat\beta) = 3\sigma^2\{2T^3 - 3T^2 + 3 - 3e^{-2T}(T+1)^2\}/(2T^6), \qquad (31)$$
$$\mathrm{var}(\hat\gamma) = 45\sigma^2\{2T^5 - 5T^4 + 15T^2 - 45 + 5e^{-2T}(T^2+3T+3)^2\}/(8T^{10}), \qquad (32)$$

and substituting into (23), (24) and (25) we get the maximum-likelihood solution

$$\mathrm{var}(\hat\alpha) = 3\sigma^2(3T^2+15T+20)/\{4(T^3+6T^2+15T+15)\}, \qquad (33)$$
$$\mathrm{var}(\hat\beta) = 3\sigma^2/\{T(T^2+3T+3)\}, \qquad (34)$$
$$\mathrm{var}(\hat\gamma) = 45\sigma^2(1+T)/\{4T^3(T^3+6T^2+15T+15)\}. \qquad (35)$$


[Fig. 2. Curves: maximum likelihood; least squares; end-point.]


Once again, the ratios of (30)-(33), (31)-(34) and (32)-(35) give us the relative efficiency of the two
methods. This ratio is also depicted as a percentage on Fig. 1.

The time scale of Fig. 1 has been normalized with respect to the time scale of the parameters of the
stochastic disturbance so that as plotted it is dimensionless. From Fig. 1 it is seen that the efficiency of
the least-squares estimators as compared to the maximum-likelihood estimators falls to a minimum
at a time T = 1.5 for the coefficients $\alpha$, $\beta$ of the linear regression and for $\beta$, $\gamma$ of the quadratic regression,
whilst it falls to a minimum at T = 3.5 for the coefficient $\alpha$ of the quadratic regression. It is also noted
that the least-squares estimator of the coefficient $\alpha$ is more efficient in quadratic regression than in linear
regression. The minimum efficiency for all cases is seen to be about 69% for $\gamma$, showing that the method
used influences the accuracy of estimation of the higher-order coefficients more than it does the lower ones.
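
The efficiencies plotted in Fig. 1 can be recomputed directly from the closed forms. The Python sketch below is an illustration (function and key names are hypothetical); it takes σ² = 1, the least-squares variances from (26), (27), (30) and (32), the maximum-likelihood variances from (28), (29), (33) and (35) with the factor $4T^3$ in the denominator of (35), and forms the ratio maximum-likelihood variance over least-squares variance.

```python
import math

def ls_variances(T):
    e = math.exp(-2 * T)
    return {
        "alpha_linear": (2 * T - 1 + e) / (2 * T ** 2),                           # (26)
        "beta": 3 * (2 * T ** 3 - 3 * T ** 2 + 3
                     - 3 * e * (1 + T) ** 2) / (2 * T ** 6),                      # (27), (31)
        "alpha_quadratic": 3 * (6 * T ** 5 - 3 * T ** 4 - 20 * T ** 3
                                + 45 * T ** 2 - 75
                                + 3 * e * (T * T + 5 * T + 5) ** 2) / (8 * T ** 6),   # (30)
        "gamma": 45 * (2 * T ** 5 - 5 * T ** 4 + 15 * T ** 2 - 45
                       + 5 * e * (T * T + 3 * T + 3) ** 2) / (8 * T ** 10),       # (32)
    }

def ml_variances(T):
    cubic = T ** 3 + 6 * T ** 2 + 15 * T + 15
    return {
        "alpha_linear": 1 / (T + 1),                                   # (28)
        "beta": 3 / (T * (T * T + 3 * T + 3)),                         # (29), (34)
        "alpha_quadratic": 3 * (3 * T * T + 15 * T + 20) / (4 * cubic),  # (33)
        "gamma": 45 * (1 + T) / (4 * T ** 3 * cubic),                  # (35)
    }

def efficiency(T):
    ls, ml = ls_variances(T), ml_variances(T)
    return {name: ml[name] / ls[name] for name in ls}

for T in (1.0, 1.5, 2.0, 3.5, 5.0, 10.0):
    print(T, {k: round(100 * v, 1) for k, v in efficiency(T).items()})
```

At T = 1.5 this gives an efficiency of a little under 69% for γ, in line with the figure quoted above. For very small T the least-squares expressions involve heavy cancellation, so the sketch should not be trusted much below T ≈ 0.1 in double precision.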


As a corollary to this work we compare a third method of estimating the slope of the regression line. This
method is simply obtained by taking the two end-points of the sample at hand and measuring the slope of
the line joining them. We can show that for Markov dependence the variance of this estimator is

$$\sigma^2(1-e^{-2T})/(2T^2).$$

For large T the variances of the least-squares and maximum-likelihood estimators are $O(1/T^3)$, so we
normalize by multiplying them by $2T^3/(3\sigma^2)$, so that what we plot in Fig. 2 for all the three estimators is

$$\Lambda = 2T^3(\text{variance of estimator})/(3\sigma^2).$$

From Fig. 2 we see that the maximum-likelihood estimator is always better than the others, but for
small T (< 4) the end-points estimator is better than the least squares.
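
The three normalized curves of Fig. 2 follow from (27), (29) and the end-point variance quoted above. A short Python sketch (illustrative names, σ² = 1) that tabulates Λ for the three slope estimators:

```python
import math

def lam_least_squares(T):
    var = 3 * (2 * T ** 3 - 3 * T ** 2 + 3
               - 3 * math.exp(-2 * T) * (1 + T) ** 2) / (2 * T ** 6)   # (27)
    return 2 * T ** 3 * var / 3

def lam_max_likelihood(T):
    var = 3 / (T * (T * T + 3 * T + 3))                                 # (29)
    return 2 * T ** 3 * var / 3

def lam_end_points(T):
    var = (1 - math.exp(-2 * T)) / (2 * T * T)   # end-point estimator
    return 2 * T ** 3 * var / 3

for T in (0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 8.0):
    print(T, round(lam_max_likelihood(T), 4),
          round(lam_least_squares(T), 4), round(lam_end_points(T), 4))
```

The table shows the behaviour described in the text: the maximum-likelihood value is the smallest throughout, the end-point value lies below the least-squares value for small T, and the order of the latter two is reversed for larger T.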


CONCLUSION
As a result of this investigation we have seen how the least-squares estimators of regression coefficients
compare with maximum-likelihood estimators when autocorrelation is present. For large samples the
two are asymptotically equivalent, but when the sample is small (as defined in §2) the efficiency of the
least-squares procedure can vary from 69% for the coefficient of the quadratic term $\gamma$ to a higher figure for the
estimate of the constant term $\alpha$. So that if an efficiency of 70% can be tolerated the method of least
squares is preferable for its facility of application.


The author is indebted to Marconi's Wireless Telegraph Co. Ltd, for permission to publish this paper
and to his colleagues for many helpful discussions.


REFERENCES
GRENANDER, U. (1954). Ann. Math. Statist. 25, 252.
GRENANDER, U. & ROSENBLATT, M. (1954). Proc. Nat. Acad. Sci., Wash., 40, 812.
RICE, S. O. (1944). Bell Syst. Tech. J. 23, 282.
WOLD, H. (1950). Trans. Int. Inst. Statist. Berne, 32, 277.


Bounds for the variance of Kendall's rank correlation statistic 


By ALAN STUART


Research Techniques Unit, London School of Economics 


1. Daniels & Kendall (1947) established that the sampling variance of Kendall's rank correlation
statistic (called t to distinguish it from the population parameter $\tau = E(t)$) obeys the inequality

$$V(t) \le 2(1-\tau^2)/n = f_1(\tau). \qquad (1)$$

They also showed that a class of rankings (called canonical rankings) existed for which

$$0.83 \le V(t)/f_1(\tau) \le 1. \qquad (2)$$

As they pointed out, no great sharpening of (1) can be expected in the canonical case.
In general, however, $f_1(\tau)$ is a very poor upper bound for V(t), as Daniels (1950) has shown. In this
paper, a new bound is obtained which is smaller than $f_1(\tau)$ in rather more than three-quarters of all
possible situations. It is a function $f_2(\tau, \rho_s)$, where $\rho_s$ is Spearman's rank correlation (grade
correlation) for the population. The ratio $f_2(\tau, \rho_s)/f_1(\tau)$ is studied, and it is found that for
$\rho_s$ as small as possible the ratio can be as low as $1/(n-1)$, so that $f_2$ is an order
of magnitude in n smaller than $f_1$. When $\tau$ and $\rho_s$ are equal, the ratio is $\tfrac13(2n-1)/(n-1)$.

2. We restrict the discussion to samples from a continuous bivariate population. In this case
Hoeffding (1947) has shown that

$$V(t) = 8\{p(1-p) + 2(n-2)(k-p^2)\}/\{n(n-1)\} \qquad (3)$$

exactly, where $p = \tfrac12(1+\tau)$ is the probability of concordance of a pair of bivariate observations, and k is
the probability that, among three bivariate observations, a specified pair is concordant with each of
the others.

We now define g as the probability that at least one of the three pairs of bivariate observations is
concordant with each of the others, and d as the probability of complete concordance of the three pairs of
observations. Then, using the formula for the probability of realization of at least one among compatible
events, we have

$$g = 3k - 2d. \qquad (4)$$

Hoeffding (1948) has shown that

$$g = \tfrac12(1+\rho_s), \qquad (5)$$

where $\rho_s$ is the population grade correlation, the analogue of Spearman's rank correlation statistic $r_s$.
Using (4) and (5) in (3), we find

$$V(t) = \frac{2}{n(n-1)}\Big\{1-\tau^2 + \tfrac23(n-2)\big[2(1+\rho_s+4d) - 3(1+\tau)^2\big]\Big\}. \qquad (6)$$

An unbiased estimator of V(t) may be constructed from (6) which is simpler to calculate than the
equivalent form based on (3) or that suggested by Daniels & Kendall (1947). For n at all large, it is the
tediousness of estimating d in (6) or k in (3), each of which requires examination of all $\binom{n}{3}$ triplets of
observations in the sample, which leads us to seek a bound for V(t) instead.
3. Mann (1945) gives a simple proof that

$$\mathrm{Prob}(X_1 < X_2 < X_3) \le \mathrm{Prob}(X_1 < X_2)\,\mathrm{Prob}(X_2 < X_3). \qquad (7)$$

Applied here, (7) gives at once

$$d \le p^2 = \tfrac14(1+\tau)^2, \qquad (8)$$

and use of (8) in (6) gives us our new bound

$$V(t) \le \frac{2}{3n(n-1)}\{(2n-1)(1-\tau^2) + 4(n-2)(\rho_s-\tau)\} = f_2(\tau, \rho_s). \qquad (9)$$

Since $\tau^2 = E(t^2) - V(t)$ and (Hoeffding, 1948)

$$E(t) = \tau, \qquad E(r_s) = \{3\tau + (n-2)\rho_s\}/(n+1),$$

we find from (9) a bound for an unbiased estimator, $\hat V(t)$, of V(t), which is

$$\hat V(t) \le \frac{2}{(3n-1)(n-2)}\{(2n-1)(1-t^2) + 4(n+1)(r_s-t)\}. \qquad (9a)$$

The inequality in (9) becomes a strict equality only when that in (8) does, i.e. when $\tau = \rho_s = +1$ or $-1$.
In this degenerate case $V(t) = f_2(\tau, \rho_s) = 0$, from (9).

4. The difference between the two bounds is

$$f_1(\tau) - f_2(\tau, \rho_s) = \frac{2(n-2)}{3n(n-1)}\{1-\tau^2-4(\rho_s-\tau)\}, \qquad (10)$$

and is of order $n^{-1}$, just as the bounds themselves are. From (10) we deduce that

$$f_1(\tau) > f_2(\tau, \rho_s)$$

if and only if

$$\rho_s < \tfrac14(1+4\tau-\tau^2). \qquad (11)$$

If $\tau = \rho_s = +1$ or $= -1$, the bounds are both zero, as is the variance.
5. We now study the behaviour of the ratio of the bounds in the interval $-1 < \tau < 1$, namely,

$$f_2(\tau, \rho_s)/f_1(\tau) = \frac{2n-1}{3(n-1)} + \frac{4(n-2)(\rho_s-\tau)}{3(n-1)(1-\tau^2)}. \qquad (12)$$


The only part of (12) we need consider is

$$F(\tau, \rho_s) = \frac{\rho_s-\tau}{1-\tau^2}. \qquad (13)$$

We approach the problem by considering $\rho_s$ as a function of $\tau$, and asking what form this function must
take to give $F(\tau, \rho_s)$ stationary values. Differentiating with respect to $\tau$, we have

$$F'(\tau, \rho_s) = \frac{(\rho_s'-1)(1-\tau^2) + 2\tau(\rho_s-\tau)}{(1-\tau^2)^2},$$

and this is zero when

$$\rho_s' = -\frac{2\tau\rho_s - (1+\tau^2)}{1-\tau^2}. \qquad (14)$$

This may be solved as a differential equation in $\tau$, and using the initial condition $\rho_s(1) = 1$, we obtain
the unique solution

$$\rho_s = \tau. \qquad (15)$$

Substitution of (15) into (12) gives

$$f_2(\tau, \tau)/f_1(\tau) = \tfrac13(2n-1)/(n-1), \qquad (16)$$

independent of the value of $\tau$. From (12) and (16), it follows that $f_2(\tau, \rho_s)/f_1(\tau) - \tfrac13(2n-1)/(n-1)$ will be
positive or negative with $(\rho_s - \tau)$. Thus (15) represents a 'shoulder' in the surface of $F(\tau, \rho_s)$. There is
no turning point.


6. For fixed $\tau$, (12) is a monotone increasing function of $\rho_s$. It is thus as small as possible when $\rho_s$ is
a minimum. Daniels (1950) and Durbin & Stuart (1951) have established the sharp bounds for $\rho_s$:

$$\tfrac12(3\tau-1) \le \rho_s \le \tfrac12(1+2\tau-\tau^2) \quad \text{for } \tau \ge 0,$$
$$\tfrac12(\tau^2+2\tau-1) \le \rho_s \le \tfrac12(3\tau+1) \quad \text{for } \tau \le 0. \qquad (17)$$

Substituting into (12) the lower bounds for $\rho_s$ in (17), we obtain

$$f_2\{\tau, \tfrac12(3\tau-1)\}/f_1(\tau) = \frac{3+(2n-1)\tau}{3(n-1)(1+\tau)} \quad (1 \ge \tau \ge 0),$$
$$f_2\{\tau, \tfrac12(\tau^2+2\tau-1)\}/f_1(\tau) = \frac{1}{n-1} \quad (-1 \le \tau \le 0). \qquad (18)$$

The ratios in (18) range from $\tfrac13(n+1)/(n-1)$ near $\tau = 1$ to $1/(n-1)$ for $\tau \le 0$. It is remarkable that we
have from (9) the bound

$$f_2\{\tau, \tfrac12(\tau^2+2\tau-1)\} = \frac{2(1-\tau^2)}{n(n-1)} \quad (-1 \le \tau \le 0), \qquad (19)$$

of order $n^{-2}$. (This is not a new type of result; Hoeffding ((1948), p. 318) gave an example where
Vi) = 2? 


In (20) $\tau > 0$ applies to the inverse canonical ranking, while $\tau < 0$ applies to the direct canonical ranking. Substitution of (20) into (12) gives


L+[7|\2 4(n—2) (1—t30 78 (2) 
p pas (En n i B-r}? 1i 
2 3 f(T) laac) lxv Al 
The expression in braces on the right of (21) is 
1- [4 7j » 
— der cde (-1«r«1), 


so that (21) is $\geq 1$, and therefore $f_1$ always provides a better bound than $f_2$ in the canonical case.




8. The diagram illustrates our results. The full lines delimit the possible combinations of the coefficients: $f_2$ is a better bound than $f_1$ for all points below the dashed line. The dotted line, representing the canonical situations, lies entirely above the dashed line. Integrations show that $f_2$ is a better bound than $f_1$ in 11/14, or 78.6%, of the possible situations.


REFERENCES 


Daniels, H. E. (1950). Rank correlation and population models. J. R. Statist. Soc. B, 12, 171.

Daniels, H. E. & Kendall, M. G. (1947). The significance of rank correlations when parental correlation exists. Biometrika, 34, 197.

Durbin, J. & Stuart, A. (1951). Inversions and rank correlation coefficients. J. R. Statist. Soc. B, 13, 303.

Hoeffding, Wassily (1947). On the distribution of the rank correlation coefficient τ when the variates are not independent. Biometrika, 34, 183.

Hoeffding, Wassily (1948). A class of statistics with asymptotically normal distribution. Ann. Math. Statist. 19, 293.

Mann, Henry B. (1945). Non-parametric tests against trend. Econometrica, 13, 245.




A note on the theory of quick tests 


By D. R. COX 
Department of Biostatistics, School of Public Health, University of North Carolina 


1. There has been a good deal of interest recently in quick approximate methods of examining statistical significance. The object of this paper is to make some general comments on the interpretation and justification of such methods. An example of a quick test is the sign test for a normal mean; given a random sample of size $n$ from a normal population of unknown mean $\mu$, the hypothesis that $\mu = 0$ is tested by referring the number of positive observations to the binomial distribution with parameter $\tfrac{1}{2}$ and index $n$. The efficient test, the $t$ test, is here so simple to work out that it is not very often that a quick test is required to replace it, the real value of quick tests being in more complicated situations, where the fully efficient test may be difficult to apply. Thus in a problem where the efficient analysis calls for, say, the solution of a complex non-orthogonal set of least-squares equations, the use of an alternative 'inefficient' procedure may be very profitable. However, it is convenient to use simple cases for illustration.


2. We consider in this paper a very special situation. Let it be required to test the null hypothesis $\theta = \theta_0$, concerning the single unknown parameter $\theta$, and let the theoretical set-up be sufficiently simple for the best significance test, based on a statistic $t$ say, to be uniquely determined and let the proposed
quick test be based on the statistic, q. Ass : 
with the large-sample approximation in which t and q follow a bivariate normal frequency distribution 
with a variance-covariance matrix which c > 
are $\sigma^2$ and $\sigma^2/E$, Fisher (1925) showed th.
the efficiency of $q$ relative to $t$.

The general remarks below apply quite generally but the quantitative analysis is restricted to the situations just described.


3. The customary method of comparing the
we plot against $\theta$ the probability of attaining


6. For given 0, q it follows from the i 
Nn n 0, £ Properties of the bivari 
distributed with mean 06(1— E) [9 - qE]o- md variance 1 hy 
and while we could, for some P $ 
i urposes, consider th joi 
unknown parameter O,itisana dons ONU Ant uf 




normal deviate based on $q$ for testing the null hypothesis $\theta = \theta_0$, is $q' = (q - \theta_0)\sqrt{E}/\sigma$, the corresponding normal deviate, $t' = (t - \theta_0)/\sigma$ has, given $q'$, mean $q'/\sqrt{E}$ and variance $(1 - E)/E$.


7. The distribution of $t'$ given $q'$ just derived depends for its frequency interpretation on the truth of Bayes's hypothesis as a statement about the frequencies of values of $\theta$ in a long series of applications of the test. It is not usual to base a theory of statistical inference on Bayes's hypothesis without abandoning the frequency interpretation of probability, but in the present paper we are merely exploring in a general way the properties of quick tests. It is perfectly reasonable to ask how the quick test would behave if $\theta$ were distributed in some special way, and if a special form is to be chosen the uniform distribution seems a natural one.

The distribution of $t'$ given $q'$ in the range of most practical interest depends on the form of the prior distribution in the interval, say $|\theta - \theta_0| < 6\sigma$; in large samples $\sigma$ will be small and so this will represent a narrow interval for $\theta$ over which the prior probability is, in some types of application, likely to vary little. If a prior distribution for $\theta$ exists and is not constant over this range the results will be affected. For example, if the prior probability density is higher at the centre of the interval, the value of $|t'|$ given $q'$ will be reduced.


8. A few numerical results derived from the formulae of § 6 are given in Table 1. Some examples will now be given to illustrate this table.


Table 1. Relation between standardized deviates t', q' when θ has a special prior distribution

  E       Mean of t'     Standard error
          given q'       of t' given q'
  0.99    1.005          0.101
  0.95    1.026          0.229
  0.90    1.054          0.333
  0.80    1.118          0.500
  0.70    1.195          0.655
  0.60    1.291          0.816
  0.50    1.414          1.000
  0.40    1.581          1.225
  0.30    1.826          1.528
  0.20    2.236          2.000
  0.10    3.162          3.000
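
The entries of Table 1 follow from the two expressions of § 6: the mean column is the factor $1/\sqrt{E}$ by which $q'$ is multiplied, and the standard error is $\sqrt{(1-E)/E}$. A minimal Python sketch that recomputes them is given below; the function name `conditional_factors` is introduced here purely for illustration.

```python
import math


def conditional_factors(efficiency):
    """Return (mean of t' per unit q', standard error of t' given q').

    Uses mean = q'/sqrt(E) and variance = (1 - E)/E, as stated in the text.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    multiplier = 1.0 / math.sqrt(efficiency)
    std_error = math.sqrt((1.0 - efficiency) / efficiency)
    return multiplier, std_error


# Efficiencies tabulated in Table 1.
EFFICIENCIES = [0.99, 0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

for e in EFFICIENCIES:
    mult, se = conditional_factors(e)
    print(f"{e:4.2f}  {mult:6.3f}  {se:6.3f}")
```

For a particular value of $q'$ the conditional mean of $t'$ is the first returned factor multiplied by $q'$; the standard error does not depend on $q'$.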


Example 1. Suppose that a test with $E = 0.99$ gives a value just significant at 5%, i.e. $q' = 1.645$. Then $t'$ has mean 1.653 and standard error 0.101. Therefore $t'$ probably lies between

1.653 ± 0.101 = (1.552, 1.754)

and fairly certainly lies between

1.653 ± 1.96 × 0.101 = (1.455, 1.851).

The significance levels corresponding to the two sets of limits are approximately (0.060, 0.040) and (0.073, 0.032).

As would be expected with so high an efficiency, serious disagreement between the results of the tests is unlikely.

Example 2. Suppose that a test with $E = 0.80$ gives satisfactory agreement with the null hypothesis, say $q' = \tfrac{1}{2}$. Then $t'$ has mean 0.559 and standard error $\tfrac{1}{2}$. The two sets of limits for $t'$ are therefore (0.059, 1.059) and (−0.421, 1.539). Therefore it is unlikely that the level of significance with $t'$ is higher than about 6%.

For many purposes the efficiency would be considered quite high, and yet appreciable disagreement between the two tests will arise quite often.

Example 3. Consider a very inefficient test with $E = 0.2$ and suppose that it gives exact agreement with the null hypothesis, i.e. $q' = 0$. Then since the standard error of $t'$ given $q'$ is 2, there is an appreciable chance that $t'$ would have given significance at a high level. If, on the other hand, $q' = 2$, indicating significance at the 2¼% level, $t'$ has mean 4.472 and standard error 2, so that it is rather unlikely that $t'$ fails to indicate high significance.
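
The arithmetic of Examples 1 and 2 can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper names `t_limits` and `upper_tail` are not from the paper, and the significance levels are taken to be one-sided normal tail areas, which is what the quoted figures appear to be.

```python
import math
from statistics import NormalDist


def t_limits(q_dev, efficiency, z=1.0):
    """Limits mean -/+ z * (standard error) for t' given the quick deviate q'."""
    mean = q_dev / math.sqrt(efficiency)
    std_error = math.sqrt((1.0 - efficiency) / efficiency)
    return mean - z * std_error, mean + z * std_error


def upper_tail(deviate):
    """One-sided significance level attached to a standardized normal deviate."""
    return 1.0 - NormalDist().cdf(deviate)


# Example 1: E = 0.99, q' = 1.645.
low, high = t_limits(1.645, 0.99)            # roughly (1.55, 1.75)
print(low, high, upper_tail(low), upper_tail(high))

# Example 2: E = 0.80, q' = 0.5, wider limits with z = 1.96.
print(t_limits(0.5, 0.80, z=1.96))           # roughly (-0.42, 1.54)
```

Small differences in the third decimal place from the printed limits arise because the paper rounds the mean and standard error before combining them.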


This example expresses quantitatively the obvious point that failure to get significance with an inefficient test gives no information about whether significance would be obtained with an efficient test, but that if the quick test gives significance it is likely that the efficient one will too.
It must again be stressed that these are illustrations of the relation between $t'$ and $q'$ for a special prior distribution of $\theta$.



9. The estimation formulae analogous to the above results for significance tests are as follows. Let $q$ be an estimate of $\theta$ with standard error $\sigma_q$ and efficiency $E$, and let $t$ be the efficient estimate from the same sample. Then if $\theta$ has a uniform prior distribution, $(t - q)$ has mean 0 and standard error $\sigma_q\sqrt{(1 - E)}$, thus enabling us to assess from the quick estimate whether our conclusions are likely to be appreciably modified by computing the efficient estimate. This will not be discussed further here.

10. Two general points need to be made about the above calculations. First, it is in practice often possible to estimate approximately the probable direction and nature of the difference between $t$ and $q$ by careful inspection of the observations, the main difference between $t$ and $q$ often being in the weight they attach to extreme observations. Secondly, all the above work is based on what happens when the observations follow the distribution law assumed in deriving the tests; the relative robustness of the two tests, which is relevant in choosing between them, has not been considered.


= 1.79,
which is significant at 10% in a two-sided test. (If we correct for continuity the result is less significant.)
Since for this test $E = 2/\pi$ (Cochran, 1937), $t'$ has mean

i.e. is 2-24 + 0-60, so that 1 


unaltered. However, we would have noted this large negative deviation and would not have expected $t$ to give a more significant answer than the quick test.


I am grateful to Prof. E. S. Pearson for very helpful criticism.


REFERENCES 


A note on the signs of gross correlation coefficients and partial correlation coefficients


By OLAV REIERSOL 
University of Oslo, Norway 


We shall use the following matrix theorem: 


THEOREM 1. If the principal minor determinants of a square matrix A and the determinant value of A itself are all positive, and if the non-diagonal elements of A are all negative, then all elements of the adjoint of A are positive.




This theorem was proved in a slightly different form by Mosak (1944, pp. 49-51). The theorem was 
stated in its present form by Metzler (1950, p. 340). 

Remark. Theorem 1 still remains true if ‘adjoint’ is replaced by ‘inverse’. 

It is evident from the proof of Metzler that Theorem 1 may be sharpened to

THEOREM 1'. If the principal minor determinants of a square matrix A are all non-negative, and if the non-diagonal elements of A are all negative, then all non-diagonal elements of the adjoint of A are positive.

Let

$$R = \begin{pmatrix} 1 & r_{12} & \cdots & r_{1n} \\ \vdots & \vdots & & \vdots \\ r_{n1} & r_{n2} & \cdots & 1 \end{pmatrix}$$

be the correlation matrix of a set of variables $x_1, x_2, \ldots, x_n$. The partial correlation coefficients of highest order of the set are given by

$$\rho_{ij} = -\frac{\text{cof.}\,(r_{ij})}{\sqrt{\{\text{cof.}\,(r_{ii})\ \text{cof.}\,(r_{jj})\}}}, \tag{1}$$

where cof. $(r_{ij})$ is the cofactor of $r_{ij}$ in the matrix $R$. We see that the sign of a partial correlation coefficient is the opposite of the sign of the corresponding element of the adjoint of $R$. Applying Theorem 1' we thus get:


THEOREM 2. If all gross correlation coefficients of a set of variables are negative, all partial correlation 
coefficients of all orders of the set are also negative. 
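
As a numerical illustration of formula (1) and of Theorem 2 for the coefficients of highest order, the following Python sketch computes partial correlations from cofactors. It is an illustration only, not part of the original argument: the function names are hypothetical, and the 4 × 4 equicorrelated matrix with all gross correlations equal to −0.2 is an invented example.

```python
import math


def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total


def cofactor(matrix, i, j):
    """Signed cofactor of element (i, j)."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]
    return (-1) ** (i + j) * det(minor)


def highest_order_partials(R):
    """Partial correlations of highest order: -cof(r_ij)/sqrt(cof(r_ii) cof(r_jj))."""
    n = len(R)
    partials = {}
    for i in range(n):
        for j in range(i + 1, n):
            denom = math.sqrt(cofactor(R, i, i) * cofactor(R, j, j))
            partials[(i, j)] = -cofactor(R, i, j) / denom
    return partials


# Hypothetical example: four variables, every gross correlation equal to -0.2.
R = [[1.0 if i == j else -0.2 for j in range(4)] for i in range(4)]
for pair, value in highest_order_partials(R).items():
    print(pair, round(value, 4))
```

In this example every partial correlation of highest order comes out negative, as Theorem 2 asserts it must when all the gross correlations are negative.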

We shall next consider the case when all partial correlation coefficients of highest order are positive. 
Then all non-diagonal elements of the adjoint of R are negative. 

Suppose first that $R$ is singular. Then the rows of the adjoint of $R$ are proportional. Since the diagonal elements of the adjoint of $R$ are all positive, the non-diagonal elements cannot all be negative when $n \geq 3$. We conclude that $R$ is non-singular when all partial correlation coefficients of highest order are positive. The inverse matrix $Q = R^{-1}$ thus exists, and its non-diagonal elements are all negative. Applying Theorem 1 we conclude that the elements of $R = Q^{-1}$ are all positive.

Let $R$ and $Q$ be partitioned symmetrically into submatrices in the same way

$$R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix} \quad \text{and} \quad Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix}.$$

Then (Hotelling, 1943, p. 4)

$$R_{11}^{-1} = Q_{11} - Q_{12} Q_{22}^{-1} Q_{21}.$$

Using the information we have about the signs of the submatrices of $Q$ and applying Theorem 1 to the matrix $Q_{22}$, we conclude that the non-diagonal elements of $R_{11}^{-1}$ are all negative. Now any partial correlation coefficient of any order of the set $x_1, x_2, \ldots, x_n$ has opposite sign of a non-diagonal element of some $R_{11}^{-1}$ of the original correlation matrix or of a matrix which we obtain from $R$ by an interchange of rows and the corresponding interchange of columns, hence we obtain:


THEOREM 3. If all partial correlation coefficients of highest order of a set of variables are positive, all 
partial correlation coefficients of lower orders are also positive, and all gross correlation coefficients are positive. 

If we change the signs of all elements of one or more of the rows of a square matrix A and afterwards change the signs of all elements of the corresponding columns, we have performed a particular kind of cogredient (also called congruent) transformation of the matrix A. Moreover, this is the only cogredient transformation whose only effect is to change signs. We shall therefore call it a cogredient change of signs of the matrix A. We shall adopt the convention that no change of signs is also a cogredient change of signs.

A cogredient change of signs of the matrix $R$ will also be called a cogredient change of signs of the gross correlation coefficients of the set $x_1, \ldots, x_n$. If $P$ is the matrix whose elements are defined by (1), a cogredient change of signs of $P$ will also be called a cogredient change of signs of the partial correlation coefficients of highest order of the set of variables $x_1, \ldots, x_n$.

The following generalization of Theorems 2 and 3 is easily proved by performing a change of signs of a subset of the set of variables $x_1, \ldots, x_n$:


THEOREM 4. If the signs of all gross correlation coefficients of a set of variables may be made negative by a cogredient change of signs, or if the signs of all partial correlation coefficients of highest order of the set may
be made positive by a cogredient change of signs, then, without any change of signs, any partial correlation 
coefficient of any order has the same sign as the corresponding gross correlation coefficient. 




REFERENCES 


Hotelling, H. (1943). Some new methods in matrix calculation. Ann. Math. Statist. 14, 1.
Metzler, L. A. (1950). A multiple-region theory of income and trade. Econometrica, 18, 328.
Mosak, J. L. (1944). General Equilibrium Theory of International Trade. Cowles Commission Monograph no. 7, Bloomington, Indiana.


The estimation of the mean of a censored normal distribution by ordered variables 


By P. G. MOORE 
University College, London 


" " i has 
1. In many practical problems the final data obtained are incomplete in that time or equipment has not been available for completion of the originally planned experiment or that something has gone wrong, leaving some observations missing. We suppose that the observations form a random sample from a normal population, but that the sample is in some way censored, meaning that all observations above some limit, or below another limit or outside both limits, are unavailable to us for the purpose of estimating the mean. The only information available to us about the missing observations is their total number.




2. One method of approach is to consider the m. 
appropriate auxiliary tables have been giv 
samples the two solutions differ. Hald su 
values of the variable, x, which fall on one side 


ris 
--&, are known but not the values of z,,, ... Ss. e is 
fixed. The asymptotic variance of both estimators is, however, the same, and it is this variance whi 


3. For computations use will be made of the expansions given by David & Johnson (1954) for ordered variables in a random sample of $n$ individuals drawn from a population whose probability density function is $f(x)$ with mean $\xi$ and standard deviation unity. We define

$$F(X) = \int_{-\infty}^{X} f(x)\,dx, \tag{1}$$

and arrange the sample observations in order of magnitude to give values $x_1, x_2, \ldots, x_n$. Defining $X_r$ by the relation

$$F(X_r) = \frac{r}{n+1} \qquad (r = 1, 2, \ldots, n)$$

and expanding $x_r$ about $X_r$ in an inverse Taylor series gives
2) 
t= XX) + 3X7U()g + Ue . ; 


where 


as examined for a few cases of sampling from a normal population. For instance, using (2), the variance of the median is

$$\frac{1.570796}{n+2} + \frac{2.467401}{(n+2)^2} + \frac{3.462732}{(n+2)^3}, \tag{3}$$



and numerical values derived from this formula are compared below with exact values for low sample sizes given by Hojo (1931):


Median in sample of n     n = 5       n = 7       n = 9       n = 11
Exact variance            0.286834    0.210446    0.166093    0.137227
Approximate variance      0.284849    0.209744    0.165793    0.137164
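
The approximate values in the table can be recomputed directly from (3). The sketch below assumes, as the leading coefficients π/2 and π²/4 and the tabulated values for n = 5, 7 and 9 suggest, that all three terms of (3) are positive and that the population variance is unity; the function name is introduced only for illustration.

```python
def median_variance_approx(n):
    """Three-term approximation (3) to the variance of the median of n unit normals."""
    m = n + 2
    return 1.570796 / m + 2.467401 / m**2 + 3.462732 / m**3


for n in (5, 7, 9):
    print(n, round(median_variance_approx(n), 6))
```

The error of the approximation relative to Hojo's exact values falls from about 0.7% at n = 5 to about 0.2% at n = 9.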


For samples up to size 10 the variances and covariances of all ordered variables have been given by 
Godwin (1949). As an illustration, we compare two cases for samples of n = 10 using terms in (2) up to 
(n.4- 2)-?. From these examples it seems that the expansions would give sufficient accuracy for practical 
purposes provided n was greater than about 10. 


                      var (x_3)    covar (x_3, x_7)
Exact value           0.17500      0.06302
Approximate value     0.17485      0.06297


4. If we use a pair of the ordered variates to estimate the mean, there are two cases to be considered, as we may either use two symmetrically placed variates, that is, $x_r$ and $x_{n-r+1}$, or we may alternatively use an unsymmetrical pair. For a one-sided censored form of distribution more observations are available to us on one side of the central value than on the other, and to make use of a symmetrical pair means that a number of observations may appear to be needlessly wasted. Considering the symmetrical case first we find from the expressions in (2) that for large samples the estimate


$$V_2 = \tfrac{1}{2}(x_r + x_{n-r+1}) \tag{4}$$


(5) 


has as variance the approximate value variance (V2) 4 


where $p_r = F(X_r) = r/(n+1)$.


We find, on substitution, that the asymptotic efficiency of $V_2$ as compared with that of the mean of the complete sample is $2z_r^2/p_r$, where $z_r$ is the ordinate of the standardized normal distribution corresponding to a probability integral of $p_r$. This function is tabulated in Table 1 together with the asymptotic maximum-likelihood efficiencies from Gupta's work based on the assumption that the distribution is censored at one end only. From the table we see that the efficiency of $V_2$ increases with $p_r$ to a maximum at about $p_r$ equal to 0.27 and then decreases to the value 0.637 which occurs when just the median is used for the estimation. Pearson showed that the maximum efficiency occurs when $p_r$ is equal to 0.271, and the table shows that if the censoring at either end is greater than 27.1% we should take the symmetrical pair that lies farthest out from the centre of the distribution, whilst if it is less we should take the pair that makes $r/(n + 1)$ as near 0.27 as possible.


Table 1. Efficiencies of estimates using a pair of quantiles

         Efficiency  Maximum-  |         Efficiency  Maximum-  |         Efficiency  Maximum-
 p_r     of V_2      likelihood|  p_r    of V_2      likelihood|  p_r    of V_2      likelihood
                     efficiency|                     efficiency|                     efficiency

 0.10    0.616       0.980     |  0.24   0.805       0.920     |  0.38   0.763       0.806
 0.12    0.667       0.974     |  0.26   0.809       0.907     |  0.40   0.746       0.784
 0.14    0.708       0.967     |  0.28   0.809       0.893     |  0.42   0.728       0.768
 0.16    0.740       0.959     |  0.30   0.806       0.878     |  0.44   0.707       0.738
 0.18    0.765       0.950     |  0.32   0.799       0.861     |  0.46   0.685       0.714
 0.20    0.784       0.941     |  0.34   0.790       0.845     |  0.48   0.661       0.685
 0.22    0.797       0.930     |  0.36   0.778       0.826     |  0.50   0.637       0.659
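
The 'Efficiency of V_2' column can be regenerated from the expression $2z_r^2/p_r$ quoted in the text. The following Python sketch uses only the standard library; `pair_efficiency` is a name chosen here for illustration, and the grid search for the maximum is a rough numerical check rather than the method used in the paper.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def pair_efficiency(p):
    """Asymptotic efficiency 2 z^2 / p of the symmetric pair at proportion p.

    z is the standard normal ordinate at the deviate whose probability integral is p.
    """
    if not 0.0 < p <= 0.5:
        raise ValueError("p must lie in (0, 0.5]")
    z = _STD_NORMAL.pdf(_STD_NORMAL.inv_cdf(p))
    return 2.0 * z * z / p


# Reproduce a few tabulated entries.
for p in (0.10, 0.26, 0.50):
    print(p, round(pair_efficiency(p), 3))

# Locate the maximum on a grid of step 0.001.
grid = [k / 1000 for k in range(50, 501)]
best_p = max(grid, key=pair_efficiency)
print("maximum near p =", best_p)
```

The grid search places the maximum close to 0.27, consistent with the value attributed to Pearson in the text.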


Next consider whether it is possible to improve the estimates by using an unsymmetrical pair of quantiles. In this case we have to use a weighted average of the pair. Let the two variates be $x_r$ and $x_s$ and consider
sof -2P xi 
fe o NW N N Gay) (6) 


where $\Phi(V)$ is the deviate of the standardized normal curve corresponding to a cumulative probability of $V$. Then if the unknown underlying normal distribution has mean $\xi$ and standard deviation $\sigma$, we have
e[*— ~ el” ef s zn 
c n+l c n+l 
and hence on substitution in (6) and reduction, we have 


eÊ =E. 


Further, as an approximation to the variance, we find 


(n-+2) var (Pq) eoe P722 i up p) 


p,(1—p) (7) 


, 


z —-2a(1—«) 
" z, 


8 8 r 
was " (3) / (753) (2). 


Using (6) and (7), the optimum values ofp,h 
the new efficiencies. Although there is some 


z, 
r?s5 


ith 
ave been found for four different values of p, together Wl 
gain, the gain is not in general very large. 


Table 2. Efficiencies of V, 


uao 


Symmetrical Ds 
Efficiency 

, Unsymmetrical Ds 
Efficiency 


5. To improve the e 


fficiency of the estimate four ordered variates 
Thus the estimator 


d. 
placed in two pairs could be use 
Vax dTo(z, tn) + 1(1—2) (z, a 


41-9) 


Table 3. Efficiencies of V, 


0:25 

= 0-42 0-42 

VAN 0-542 0-615 

ciency « :872 
Maximum-likelihood efficiency em 9313 | 


Table 4. Efficiencies of 


V, with p, = 0-5 
i 0-1 0-2 Mi 
0-3 
a 0-51 0-38 0-25 gie 
ee | 

Efficiency 0-86 50 

zih! “866 t b 0 
Maximum-likelihood efficiency 0-980 0541 dee 


784 
0-878 0-78 




and « for various values of p, together with the efficiency thereby achieved and the corresponding 
maximum-likelihood efficieney when a portion p, is censored. In Table 4 we take the case where p, is 
equal to 0-5, i.e. we use the median and two symmetrically placed variates. The optimum efficiencies 
are now somewhat greater than when just two variates were used and are approaching the maximum- 
likelihood values. 


6. The estimates could be improved by using more variates but would then become rather unwieldy. 
The results obtained would seem to show that if there is a fair amount of censoring, say over 20% of 
the distribution, then the simple estimates based on ordered variables may suffer very little in efficiency 
when compared with a maximum-likelihood estimate of the mean. On the other hand, when there is 
very little censoring, say under 10%, there is a considerable gain in using the maximum-likelihood 
methods of estimation. In a situation which involves censoring at both ends of the distribution, the 
ordered variates would be even better vis-à-vis the one-ended maximum-likelihood estimate, but such
cases would occur rarely in practice. 


Later note. The recent publication of a paper by D. Teichroew (1956, Ann. Math. Statist. 27, 410) enables the exact efficiency of $V_2$ to be calculated and compared with the approximate efficiencies obtained as in Table 1 for the case of a sample of size 20.


                     Efficiency of V_2
  r      p_r       Exact     Approximate

  4      0.1905    0.780     0.775
  5      0.2381    0.816     0.805
  6      0.2857    0.824     0.808
  7      0.3333    0.810     0.796
  8      0.3810    0.779     0.763


Thus the approximate formula slightly underestimates the efficiencies for a sample of size 20 and $V_2$ is in fact slightly better, as compared with the maximum-likelihood estimator, than it appeared to be from Table 1.
REFERENCES 

David, F. N. & Johnson, N. L. (1954). Biometrika, 41, 228.

Godwin, H. J. (1949). Ann. Math. Statist. 20, 279.

Gupta, A. K. (1952). Biometrika, 39, 260.

Hald, A. (1949). Skand. AktuarTidskr. 33, 119.

Hojo, T. (1931). Biometrika, 23, 315.

Pearson, K. (1920). Biometrika, 13, 113. 


A note on Wilcoxon's and allied tests 


By F. N. DAVID 
University College, London 


1. It is assumed that there are available two samples $n_1$ and $n_2$ randomly and independently drawn from each of two populations supposedly identical under the null hypothesis. The $n_1 + n_2$ joint sample values are ranked in ascending order of magnitude thus forming a random sequence of two alternatives in which order must be taken into account. Mann & Whitney's adaptation of Wilcoxon's test consists, in essence, of taking the sum of the ranks of one of the alternatives. This criterion is used to test for the equivalence of the parameters of location of the two populations. Rosenbaum (1953) and Kamat (1956) have also discussed criteria based on the random sequence which may be used, if desired, to investigate the equivalence of the parameters of dispersion of the two populations. It is the purpose of the present note to suggest a method of approach to tests on the random sequence whereby symmetric functions of the ranked observations may be used. Since such an approach is based on approximating to the probability distribution of a criterion by means of its moments, it will only be useful for sequences of moderate length and upwards.

2. The random sequence of length $n_1 + n_2 = N$, in which we take account of order, can be considered as a finite population of the first $N$ integers. The ranks of the first sample will be $n_1$ elements randomly drawn from this finite population without replacement. The moments of any symmetric function of the first sample can therefore be written down immediately either from Isserlis's results (1931) or by the procedure devised by Irwin & Kendall (1944) or from Wishart's tables (1952). Such moments will be in terms of the K-statistics of the first $N$ integers which can themselves be obtained from the moments of the first $N$ integers. Thus


N+l | — NNa) o. N*N4 1)? 
Hi Ms Sa XK == 120 ^" 
I _ NAN 41) N*(N 4-1) 

5 252 504 ^" 
NY (N40)? — 
Ky = USE eet cry anv 4.1) 49), 


and so on. 
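
The first few K-statistics of the finite population 1, 2, ..., N can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard expressions for k-statistics in terms of central moments, and compares the results with the closed forms $K_1 = \tfrac{1}{2}(N+1)$, $K_2 = N(N+1)/12$, $K_3 = 0$ and $K_4 = -N^2(N+1)^2/120$, which are this sketch's reading of the display above. The function name is hypothetical.

```python
from fractions import Fraction


def k_statistics(values):
    """First four k-statistics of a finite set of values (requires at least 4 values)."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least four values")
    mean = Fraction(sum(values), n)
    # Central moments with divisor n, kept exact by using rational arithmetic.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    k1 = mean
    k2 = n * m2 / (n - 1)
    k3 = n**2 * m3 / ((n - 1) * (n - 2))
    k4 = n**2 * ((n + 1) * m4 - 3 * (n - 1) * m2**2) / ((n - 1) * (n - 2) * (n - 3))
    return k1, k2, k3, k4


N = 20
K1, K2, K3, K4 = k_statistics(list(range(1, N + 1)))
print(K1, K2, K3, K4)
print(Fraction(N + 1, 2), Fraction(N * (N + 1), 12), 0, Fraction(-N**2 * (N + 1) ** 2, 120))
```

Exact rational arithmetic is used so that the agreement with the closed forms is not obscured by rounding.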


3. The mean of the ranks of one sample. Abdel-Aty (19 
M(1") = &y(ky—K,yr, 


54) has shown that if we write 


ar-2 ar-3 a 
zm ue ccs — — -(— ed 
A, = ar +r e(l Eo 
l 1 
a=, 
n N 


where N is the number in the finite population and z that in the 


M(1)=0, M12) = KjA, M(13) = K,A 
Let tyty ss Uns 
Then we have 


sample, then 

» M(1) = K,A,+3Ky.A}. 

be the ranks of the individuals of the sample of Nys 
N+1 


mē) = m 


ks. 
and z, the mean of these 74 
nN +1) 


> dq) = 12n, - > HT) = 0, 


Kz) = Nt) EZ 1 .2N 
ni ny 


The distribution of $\bar{x}$ will tend to normality as $N$ increases provided $n_1/N$ remains fixed. To test therefore for equivalence of the population parameters of location with the alternative hypothesis that they may be different we would take the test criterion
E Cz | 
2 | 
12n, m 
E bu 5 TAN tables. This is Mann & Whitney’s procedure for N reasonably large and pelt? A 
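
A rough sketch of the large-sample procedure described in this section is given below. It is not a transcription of the paper's formulae, which are only partly legible here: it assumes that the mean rank of the first sample has expectation $\tfrac{1}{2}(N+1)$ and variance $K_2(1/n_1 - 1/N) = n_2(N+1)/(12 n_1)$, that there are no ties, and that the standardized criterion is referred to normal tables. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist


def mean_rank_test(sample1, sample2):
    """Normal approximation for the mean rank of sample1 in the pooled ranking.

    Assumes no tied observations. Returns (standardized deviate, two-sided level).
    """
    pooled = sorted([(v, 0) for v in sample1] + [(v, 1) for v in sample2])
    ranks1 = [rank for rank, (_, label) in enumerate(pooled, start=1) if label == 0]
    n1, n2 = len(sample1), len(sample2)
    total = n1 + n2
    mean_rank = sum(ranks1) / n1
    expected = (total + 1) / 2
    variance = n2 * (total + 1) / (12 * n1)
    deviate = (mean_rank - expected) / math.sqrt(variance)
    level = 2.0 * (1.0 - NormalDist().cdf(abs(deviate)))
    return deviate, level


# Hypothetical data for illustration only.
print(mean_rank_test([1.2, 2.5, 3.1, 4.8], [3.9, 5.0, 6.2, 7.7, 8.1]))
```

Because the sum of the ranks is $n_1$ times their mean, this criterion is equivalent to the rank-sum form of the test for untied data.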
4. The variance of a sample of », individuals drawn from a finite population of N we define i 
a RTT 
hdd è 
The moments of k, have already been given in the form of multiple subscript K’s by Wishart. The 
M(2) = 0, 
M(2?) = eru (Ks — K2), 


with longer expressions for $M(2^3)$ and $M(2^4)$ which we do not reproduce here. Irwin & Kendall give the second moment as

n4Nn, —N—n—1),. n 2na 
n((,—1)N(N 1) * (n, C1) (N +1) 


To test whether there is a difference in dispersion one criterion could be to take the ratio $k_2/K_2$ or, since $K_2$ is a constant, just $k_2$. The momental constants of the ratio for sequences of 20 and 30 are given in Table 1. The standardized percentage points are obtained by interpolation from Table 42 of Pearson & Hartley (1954). For $n_1$ and $n_2$ very different it is clear that an assumption of normality for the distribution of a sum of squares will be inadequate unless the sequence is of length at least greater than 30.


K 


1219 


M(2?) = Ex(ky—K,)? 


Table 1. Momental constants of k_2/K_2 for various values of N and n_1

                                                                      2½% points
  N    n_1   M(2²)/K_2²   {M(2²)/K_2²}^½   √β_1      β_1       β_2      Upper    Lower

  20   10    0.0522       0.2284            0.023    0.0005*   2.746    1.95     -1.93
       12    0.0334       0.1828           -0.047    0.002     2.749    1.92     -1.96
       14    0.0200       0.1445           -0.122    0.015     2.730    [?]      -2.00
       16    0.0119       0.1092           -0.219    0.048     2.674    1.79     -2.04

  30   15    0.0319       0.1786            0.008    0.000     2.843    1.95     -1.94
       18    0.0207       0.1439           -0.045    0.002     2.842    1.92     -1.97
       21    0.0131       0.1142           -0.103    0.011     2.828    1.89     -2.00
       24    0.0075*      0.0866           -0.180    0.032     2.788    1.885    -2.03


Mood (1954) suggests as a rank criterion to test dispersion, the sum of squares of the deviations of one sample from the population mean, $\tfrac{1}{2}(N + 1)$. This is

$$\sum_{i=1}^{n_1} (x_i - K_1)^2 = (n_1 - 1)k_2 + n_1(k_1 - K_1)^2,$$

the moments of which can be written down immediately from Wishart's tables. Which criterion is the more sensitive to changes in the dispersions of the two populations generating the samples it does not appear as yet possible to say. It is also difficult to see whether a criterion based on the sums of squares of both samples, e.g.

$$k_2(1) + k_2(2),$$

will be an improvement on a criterion based on a single sample, but probably little sensitivity will be gained if this is tried. Whatever criterion is used it will be possible from Wishart's tables to write down its moments either exactly or to any order of approximation desired and to approximate to its probability distribution (for sequences of reasonable length) by means of a smooth curve.

Tests based on symmetric functions of all the ranks of a single sample should be more discriminating than tests based on ordered variables, such as range for example. One point should, however, be mentioned about all ranked tests for dispersion differences. It may be that at the same time as there is a change in the dispersion parameters of the populations generating the sample, there is also a change in the location parameters which will effectively mask it. It does not seem possible to devise a single criterion based on ranks which would take this into account.


5. Correction for ties. Corrections for ties have been discussed recently by Putter (1955) who gives 
a list of references. Provided the sequence is long enough and the ties not too frequent a straightforward 
and easy allowance can be made. Suppose we have v elements which tie at rank r. The usual method of 
assigning ranks is to give each of the $v$ elements the same rank, $r + \tfrac{1}{2}(v - 1)$. $K_1$ is then unaltered but all the other K-statistics of the population are changed. We can calculate the K-statistics of this changed population and use these values in the moments of whatever criterion we are interested in. For $n_1$ and $n_2$ large and for the number of sets of ties not too small the effect appears negligible.




REFERENCES 


Abdel-Aty, S. H. (1954). Biometrika, 41, 253.

Irwin, J. O. & Kendall, M. G. (1944). Ann. Eugen., Lond., 12, 138.

Isserlis, L. (1931). Proc. Roy. Soc. A, 132, 586.

Kamat, A. R. (1956). Biometrika, (in the Press).

Mood, A. M. (1954). Ann. Math. Statist. 25, 514.

Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University Press.

Putter, J. (1955). Ann. Math. Statist. 26, 268.

Rosenbaum, S. (1953). Ann. Math. Statist. 24, 663.

Wishart, J. (1952). Biometrika, 39, 1.


Likelihood function for capture-recapture samples 


By N. E. G. GILBERT 
John Innes Horticultural Institution 


When capture-recapture methods are used on a will 
so that the distribution of marked and unmark 
binomial. However, samples of insects have som 


s, atom AR RIGOR 

ll, of course, bemore complicated than this in practic 

* X(z/X —n|N): 

2= 5 »- 

a OB —2]X)(1—ajn)’ 

er all samples in the chai 
n/N in the denominator, 
btainable from the other 


where the summation extends ov 
but since z/X is used instead of 7 
where information about N is o 
in the numerical example. 


: j T re 
A better weighting of the contributions from the different samples is obtained by minimizing the e 
usual expression for y? with denomina. 


s ing 
tor (n/N) (1 —n[N) (1 — X|N), but the solution of the result 
equation involves heavy arithmetic. 

Using the binomial approximation, the likelihood function is


$$\log L = \sum \{x \log (n/N) + (X - x) \log (1 - n/N)\},$$


M ient; 
n. These estimates are asymptotically ane 
they can be improved upon in the presen? ate 
samples in the chain. This point is illustr 


so that

$$\frac{d(\log L)}{dN} = \sum \frac{nX - Nx}{N(N - n)}.$$
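
The binomial maximum-likelihood equation, obtained by setting the derivative above to zero, can be solved numerically. The sketch below is only an illustration: it reads $n$ as the number of marked animals at large, $X$ as the size of a sample and $x$ as the number of marked animals in it (an interpretation of the symbols, not a statement from the paper), solves by bisection, and uses invented data.

```python
def score(N, samples):
    """d(log L)/dN for the binomial approximation; samples are (n, X, x) triples."""
    return sum((n * X - N * x) / (N * (N - n)) for n, X, x in samples)


def binomial_mle(samples, upper=1e6, tol=1e-9):
    """Root of the score between max(n) and `upper`, found by bisection.

    Assumes the sample with the largest n contains at least one unmarked
    animal (x < X), so that the score is positive just above max(n),
    and that the root lies below `upper`.
    """
    low = max(n for n, _, _ in samples) + 1e-6
    high = upper
    if score(low, samples) <= 0 or score(high, samples) >= 0:
        raise ValueError("score does not change sign on the search interval")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if score(mid, samples) > 0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical two-sample chain, for illustration only.
print(binomial_mle([(10, 8, 4), (11, 9, 4)]))
```

With a single sample the root reduces to $N = nX/x$, which gives a quick check on the routine.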


"This expression may also be obtained by differentiating the numerator only of 
$$\chi^2 = \sum \frac{(Nx - nX)^2}{Nx(N - n)}$$

w.r.t. $N$.

In the hypergeometric case, each term in this expression for $\chi^2$ is divided by $(1 - X/N)$. The
maximum-likelihood equation, when similarly modified, becomes

$$\frac{d(\log L)}{dN} = \sum \frac{nX - Nx}{(N - n)(N - X)}.$$
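The binomial maximum-likelihood equation, $\sum (nX - Nx)/\{N(N - n)\} = 0$, has no closed form once the chain contains more than one sample, but it is easy to solve numerically. The following is a minimal sketch, not the author's procedure: it assumes each sample is recorded as a triple (n, X, x), with n marked animals at large, X animals caught and x of them marked, and it finds a root by bisection. The function names and the data in the usage lines are hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[int, int, int]  # (n marked at large, X caught, x caught marked)


def score(N: float, samples: Sequence[Sample]) -> float:
    """d(log L)/dN under the binomial approximation."""
    return sum((n * X - N * x) / (N * (N - n)) for n, X, x in samples)


def mle_population(samples: Sequence[Sample],
                   upper: float = 1e7, tol: float = 1e-9) -> float:
    """Solve score(N) = 0 by bisection on the interval (max n, upper)."""
    lo = max(n for n, _, _ in samples) + 1e-6  # N must exceed every n
    hi = upper
    if score(lo, samples) <= 0 or score(hi, samples) >= 0:
        raise ValueError("no sign change: check the data or raise `upper`")
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if score(mid, samples) > 0:
            lo = mid   # score still positive: the root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # A single sample reduces to the familiar estimate N = nX/x.
    print(mle_population([(10, 20, 5)]))
    # Hypothetical two-sample chain.
    print(mle_population([(10, 20, 5), (15, 30, 9)]))
```

With a single sample the root is simply $nX/x$, which gives a quick check on the routine. The hypergeometric modification divides each term by $(1 - X/N)$ and could be handled by changing only the `score` function.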


number marked is now eleven " 


(With zero births and deaths, the distribution of 




This provides a simple numerical illustration; the two sampling fractions are different but not too much
so (i.e. the estimation is not dominated by the information from one sample), and the two estimates of
N from the individual samples differ (but not significantly).

The estimates are 


Chapman's minimum $\chi^2$ N = 20.4


Binomial &-ncsiem 

Modified binomial N = 23.1 ± 3.68

Hypergeometric N = 23.2 ± 3.54
REFERENCE 


CHAPMAN, D. G. (1951). Univ. Calif. Publ. Statist. 1, 131.


Tables of Poisson power moments 


By J. B. DOUGLAS 
School of Mathematics, N.S.W. University of Technology, P.O. Box 1, Kensington, New South Wales 


In the fitting of the complete and truncated Neyman Type A contagious distribution by maximum-
likelihood methods, the rather tedious arithmetic can be considerably reduced by use of tables of moments
about the origin of the Poisson distribution. Appropriate tables, to 3 or 4 (and occasionally 5) significant 
digits, with a description of their method of use, were published in Biometrics, 11 (1955), 149-73. 

The tables are in fact of ratios of moments. If 


$$\mu'_x = e^{-\lambda} \sum_{r=0}^{\infty} r^x \lambda^r / r!,$$

then $p_x = \mu'_{x+1}/\mu'_x$ and $q_x = p_x/(p_{x+1} - p_x)$ are tabulated for

$x = 0\,(1)\,19$,
$\lambda = 0.000\,(0.001)\,0.03\,(0.01)\,0.3\,(0.1)\,3.0$.

However, these tables also exist in manuscript form with values of $p_x$ calculated correct to 6 or 7
significant digits (and corresponding values of $q_x$), the ranges of tabulation being those given above.


The existence of these more detailed tables is therefore recorded here; the author would be glad to learn
of any applications where their use would be desirable. 
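Moments of the Poisson distribution about the origin are also easy to generate by machine. The sketch below is not taken from the note: it uses the standard recurrence E[R^(k+1)] = λ Σ_j C(k, j) E[R^j] and cross-checks it against the defining series. The function names are hypothetical, and the ratio of successive moments formed in the usage lines is only one plausible reading of the tabulated quantity, so it should be treated as an assumption.

```python
from math import comb, exp, factorial
from typing import List


def poisson_raw_moments(lam: float, max_order: int) -> List[float]:
    """Return [mu'_0, ..., mu'_max_order] for a Poisson variate with mean lam.

    Recurrence: E[R^(k+1)] = lam * E[(R + 1)^k] = lam * sum_j C(k, j) * E[R^j].
    """
    moments = [1.0]
    for k in range(max_order):
        moments.append(lam * sum(comb(k, j) * moments[j] for j in range(k + 1)))
    return moments


def poisson_raw_moment_series(lam: float, x: int, terms: int = 100) -> float:
    """Direct evaluation of exp(-lam) * sum_r r^x lam^r / r!, truncated.

    One hundred terms are ample for lam <= 3, the range of the tables.
    """
    return exp(-lam) * sum(r ** x * lam ** r / factorial(r) for r in range(terms))


if __name__ == "__main__":
    lam = 2.0
    m = poisson_raw_moments(lam, 5)
    print(m[:4])                                   # first few raw moments
    print(poisson_raw_moment_series(lam, 3))       # cross-check for order 3
    ratios = [m[k + 1] / m[k] for k in range(len(m) - 1)]
    print(ratios)                                  # successive-moment ratios (assumed form)
```

The recurrence avoids the large powers and factorials of the direct series and keeps every term positive, so it loses little accuracy at the orders (up to 19) covered by the tables.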




REVIEWS 


" ak 
Theory of Games and Statistical Decisions. By D. BLACKWELL and M. A. GIRSHICK.
New York: John Wiley and Sons, Inc.; London: Chapman and Hall, Ltd. 1954.
Pp. xi + 355. 60s.


Readers of this journal will have seen Barnard’s detailed review of A. Wald's Statistical Decision
Functions (Biometrika, 1953, 40, 475-7). The present book shows how statistical problems of various
types can be handled via games theory and decision functions. It is intended as a text for first-year
graduate students who may, however, find it heavy going; they will be familiar with the […]
required, but not with a number of other topics mostly of a topological nature. The authors have tried
to make the book self-contained as regards most of these, but even so it will never be light reading.
The detailed bibliography provided, together with a reading list for each chapter, will help to overcome
some of these difficulties.


The first few chapters are devoted to a thorough exposition of games theory; since the whole argument
is in terms of that theory this was necessary. Even so, a preliminary reading of McKinsey's
Introduction to the Theory of Games will be useful. A general description of statistical games (i.e.
statistical problems viewed as games between nature and the statistician) rounds off this part. This is
followed by a chapter on utility and principles of choice which is, unfortunately, too short to provide
in Ramsey-von Neumann terms and of the principles of choice 2 
d Bayes. The reader is not made aware that other ‘reasonable 
criteria exist as well—presumably the authors had to draw a line somewhere. The rest of the boo ie 
those foundations. There are two chapters on fixed sample-size 
s. The final chapters are concerned with theory of estimation 
re most interesting and rewarding. iol 
Surprised to find that his way of making decisions following glass 
the exacting standards set; by the authors, It should be noted b 
i design of experiments or sequential pro 
tical habits is reassuring—which does x 
vious, since a number of new applications follo 
from the new outlook. 


5 i 
but this might be considered s E 
ritis also true that the statistici® 


The book contains many stimulating exercises and is well produced. G. MORTON 


udents of Social Science and Business. By P- $ 
w-Hill Book Co. Inc. 1955. Pp. 392. 41s. 6d. 
Basic Statistical Concepts. By J. K. ADAMS. London: McGraw-Hill Book Co. Inc.
1955. Pp. 304. 41s. 6d.


- Nine chapters are devote s 
of measuring location and disper ah 
will be elementary for all. The p E 
bias by discussing the analys! 


his 
research and teaching activities j 


The American Statistician. 




time series and index numbers, two branches of statistics favoured by economists and business men. 
At the level at which the author aims the book is reasonably adequate. It is frankly utilitarian and 
conveys nothing of the excitement and interest attendant on the analysis of statistical problems. 
Certain topics might well have been omitted or at least accurately discussed. For example, it is 
questionable whether the statement that maximum-likelihood estimates are unbiased is sensible, 
since often maximum likelihood and bias seem to go hand in hand. 

The reviewer would judge this book as covering much of the work which would be done by a 
first year student in an English University reading statistics ancillary to another subject. Such students
undoubtedly find it easier to think in words rather than symbols and by them this book may be found 
helpful. 

Mr Adams, who writes Basic Statistical Concepts, is an Assistant Professor of Psychology. This book 
is not, however, a combination of factor analysis and ‘stat-rats’ as might possibly be expected, but a 
straightforward exposition of elementary statistics covering, from a less mathematical point of view, 
much the same ground already covered by A. M. Mood, Introduction to the Theory of Statistics (McGraw-
Hill). There is no question in the reviewer's mind which book is to be preferred. Given the writer’s
objective, however, the exposition is tolerable if uninspired, and may be suitable as a textbook for
the type of student which he has in mind. It is interesting to note that in our parent subject of 
mathematics every teacher of mathematics does not find it necessary to write a text-book for the use 
of his classes—in fact very few do. Immediately after the war when the spate of statistical text-books 
began it seemed to be a tribute to the virility of our subject that so many persons should feel the 
urge of exposition. Looking back over the decade one realizes that it is probably because the inter-
pretation of statistical mathematics is essentially subjective, and to each teacher the techniques and
basic framework mean something slightly different. Nevertheless, the reviewer would suggest that the 
time has come to call a halt. Enough books on statistical methods now exist to suit most points of 
view even if it is necessary to indicate while teaching different emphasis where it appears desirable. 
In ten years we have grown up and what our subject now requires is not more and more verbiage on 
the elements but a series of specialist monographs each on a separate branch, a process which is already 
under way in the design of experiments. F. N. DAVID


Experimental Design and its Statistical Basis. By D. J. FINNEY. London: Cambridge
University Press. 1955. Pp. xi+169. 30s.


Experimental Design. Theory and Application. By W. T. FEDERER. New York 
and London: The Macmillan Company. 1956. Pp. 544+47. 77s. 


It is said that Beethoven when young divided his time between the study of counterpoint and the 
composition of music without becoming aware that the one was supposed to be an aid to the other. 
Many applied statisticians will sympathize with him in this. They spend their lives trying to under- 
stand the purpose and background of research programmes, debating the merits of different sets of 
treatments, and assessing the need to make a certain comparison especially accurate, and usually
conclude by designing something quite simple in randomized blocks. In books, however, they live 
in a world of hyper-graeco-latin-cubes and super-magic latin squares; while, if Prof. Federer's early
pages are to be believed, the selection of treatments and of characteristics to be measured are ‘non- 
statistical’. He may be right, but Dr Finney would not agree. 

Dr Finney’s book is at once remarkable and remarkably good. It aims not at teaching statistics
but at explaining what statisticians do. It discusses experimental design from the point of view of
logic and common sense, and shows how many scientific arguments can be expressed in numerical
form. There would be little point in a chapter-to-chapter critique; it will perhaps suffice to say that 
there is now at least one statistical laboratory where biologists asking for instruction will be asked 
first to read this book. They will not understand the chapter on fractional replication and confounding, 
but they should thereafter see statistical methods as a systematization of what they have previously 
done unmethodically, not as an alien culture imposed by a conquering power on the ancient civilization 
of medical and biological science. 

Prof. Federer’s book is different in purpose; it deals with experimental design only in the narrowest 
methodological sense, and within those limits is concerned to be comprehensive and detailed, which it 
certainly is. Also, it has a bibliography that is well-chosen and extensive. Nevertheless, for all its 
merits this is a disconcerting book to read. Thus, on p. 307 there starts a passage about ‘lattices’ and 
‘incomplete blocks’ that the reviewer still found incomprehensible after much rereading, nor was he 




helped by finding a table subjoined headed, ‘Incomplete blocks in a randomized complete block design’.
Light dawned on p. 314 from a passage, which in any other context would have been a masterpiece
of obscurity, from which it was gathered that the two terms were synonymous, and that incomplete
blocks in complete block designs were what most people call lattices. This is an extreme
example, but others nearly as bad could be adduced. Nevertheless, on account of its detail and
completeness this book is to be commended. S. C. PEARCE


Population Genetics. By CHING CHUN LI. London: Cambridge University Press, for
University of Chicago Press. 1955. Pp. xi+366. 75s. 


This book is a summary of the mathematical theory of population genetics. It begins with the study
of large random-mating populations, and the Hardy-Weinberg law. It goes on to the consideration
of correlations between relatives, and the modifications required for linked and sex-linked genes, and
for autopolyploids. Various schemes of inbreeding are then discussed, such as selfing and sib-mating.
Path coefficients are explained, and used in the study of correlation and more complicated schemes of
inbreeding. The author goes on to discuss the effects of assortative mating, mutation, selection,
subdivision and migration. Finally, there are two chapters on the random fluctuations which are
inevitable in small populations. In short, most of the subjects of major importance in human and
population genetics are included, although in some cases the treatment is disappointingly brief,
presumably in order to keep the size of the book within reasonable limits. But it would be very
interesting to have more information on, for example, polyploid segregations. The emphasis is entirely
on the mathematical theory, and only a little is said about the practical analysis of actual data.
The explanations are extremely lucid, and anyone with a knowledge of quite elementary statistics
and genetics should be able to follow most of the book without difficulty (even without calculus).
ough the proof of the ‘simplified version’ of Fisher 
273) is incorrect as it stands, since Li forgets to include 
n (after selection) and the next generation (at birth). 
ttle confused, and may not take fully into account the 
linkage and assortative mating. And Li does not menton 
he gene-frequency distributions is rather controversial. Bu 
n the whole Li has set out the argument very clearly an 
mathematical models represent an oversimplified version © 


The printing is excellent, and the diagrams are clear and helpful. There is a useful bibliography at
the end. This book can be warmly recommended to anyone interested in the mathematical treatment
of the genetical structure and evolution of populations. CEDRIC A. B. SMITH


Statistical Methods (third edition). By FREDERICK C. MILLS. London: Sir Isaac
Pitman and Sons, Ltd. 1955. Pp. xviii + 842. 50s.


This third edition of Prof. Mills's well-known B 
account of the more important of recent developments (in statistical theory) that bear on the
applications of statistics in the social sciences, in business administration, and in governmental affairs.
It is addressed to the non-mathematical reader.


The presentation is clear 




is intended to deal with the use of statistics in business administration, there is no reference to market 
research; and while random, stratified, systematic and cluster sampling are described, I have found no 
reference, even in condemnation, to quota sampling. One might also have expected, in a work at this 
level, some discussion of the collection of statistical data, the importance of definitions, margins of 
observational error, and similar matters which it is essential to consider when dealing with social, 
economic, business and administrative statistics. 

The perfect text-book of elementary statistics still remains to be written, and perhaps it is an 
impossibility, but the book under review will suit very many students. Their attention is directed to 
the ‘errata sheet’ which precedes the first page of the text; this relates to the bibliography on
pp. 821-30, from which, by an oversight, the numbers were omitted without which references in the text
are meaningless. FREDERICK BROWN


The Art of Investment. By A. G. ELLINGER. London: Bowes and Bowes. 1953.
Pp. 170. 15s. 


Part I of Mr Ellinger’s book briefly sketches the mechanisms of the London Stock Markets, and details 
the recent history of security price-movements. Parts II and III are concerned with investment 
strategies and decisions, considered as functions of the movements in charts of prices, yields and
volumes of share trading. These charts are taken as embodying the collective wisdom of investors. 
Since it is unprofitable to invest against market trends, the practical problem is to set up criteria for 
deciding the directions of the main trends. Mr Ellinger’s system is simple, although hedged about 
with rather imprecise qualifications and exceptions. It depends ultimately on the linearity of trends 
in the time series, charted on a logarithmic scale. (The unattractive consequence, that the trends on 
a natural scale are exponential, is not discussed.)

The important question is: does Mr Ellinger’s system work reasonably well? The evidence presented
is suggestive, but far from conclusive. I should like Mr Ellinger to formulate his system categorically
and exhaustively, and apply it publicly to a random sample of securities selected by someone else. The 
results would be more convincing than any amount of examination of the past. A. STUART 


Measuring Business Changes; a Handbook of Significant Business Indicators. 
By RICHARD M. SNYDER. New York: John Wiley and Sons, Inc.; London: Chapman
and Hall, Ltd. 1955. Pp. 382. 64s. 


Mr Snyder's reference book describes and explains about fifty available economic ‘indicators’ for the
United States economy, ranging over National Income, Population Growth, Labour Statistics, Com-
modity Prices, Indices of Production and Activity (both general and specific to some industries),
Domestic and International Trade, Finance and Stock Markets. It is clearly an indispensable source 
for anyone seeking published information in this field, and gives ground for hope that a similar single 
source book will become available for the United Kingdom. A. STUART 


Statistics (second edition). By L. H. C. TIPPETT. Oxford University Press. 1956.
Pp. 224. 7s. 6d. 


This book, already a classic in the ‘expositions-for-the-layman’ group, has been revised and brought
up to date by the author. The general plan and coverage of the book remain unchanged.


Tafeln zum Vergleich zweier Stichproben mittels X-Test und Zeichentest.
(Tables for comparing two samples by X-Test and Sign Test.) By B. L. VAN DER
WAERDEN and E. NIEVERGELT. Berlin: Springer-Verlag. 1956. Pp. 34. DM. 4.80.

This booklet describes the method of computation and reference to tables of the pseudo-‘t’ order-
statistic ‘X’, which van der Waerden proposed as a test of difference of central tendency in two
samples (Math. Ann. (1953), 126, 93). A similar discussion of the sign test analogue of the matched-pairs
t-test is also given.
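For readers who have not met the statistic, the sketch below shows one common way of computing a van der Waerden type X. It is an illustration under stated assumptions, not a transcription of the booklet: the observations of both samples are ranked together, each rank r is replaced by the normal deviate at r/(N + 1), and X is the sum of these deviates over the first sample. The function name is hypothetical, ties are not handled, and the null variance is the usual one for a linear rank statistic whose scores sum to zero.

```python
from statistics import NormalDist
from typing import Sequence, Tuple


def van_der_waerden_x(sample_a: Sequence[float],
                      sample_b: Sequence[float]) -> Tuple[float, float]:
    """Return (X, null variance of X) for two samples without ties."""
    pooled = sorted(list(sample_a) + list(sample_b))
    if len(set(pooled)) != len(pooled):
        raise ValueError("ties are not handled in this sketch")
    total = len(pooled)
    inverse_normal = NormalDist().inv_cdf
    # Normal score attached to each observation via its rank in the pooled sample.
    score = {value: inverse_normal((rank + 1) / (total + 1))
             for rank, value in enumerate(pooled)}
    x_stat = sum(score[value] for value in sample_a)
    m, n = len(sample_a), len(sample_b)
    variance = m * n / (total * (total - 1)) * sum(s * s for s in score.values())
    return x_stat, variance


if __name__ == "__main__":
    # Hypothetical data: the first sample lies mostly below the second.
    first = [1.2, 2.4, 0.7, 3.1]
    second = [2.9, 4.0, 3.6, 5.2, 4.4]
    x_stat, variance = van_der_waerden_x(first, second)
    print(x_stat, variance, x_stat / variance ** 0.5)
```

Interchanging the two samples changes only the sign of X, because the normal scores are symmetric about zero; for small samples the exact critical values would come from tables such as those reviewed here rather than from the normal approximation.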




Part II (in German) gives a worked example and Part IV is the English translation of this. Part III
consists of tables (largely derived more or less directly from standard normal and binomial tables)
and Part I gives, in German, a discussion of the relative merits of different tests and the method of
computation of the tables. Since the X-test is applicable in exactly the same circumstances as Mann
& Whitney's generalization of Wilcoxon's test and since, as van der Waerden notes in his original
paper (Proc. K. Acad. Wet. Amst. A (1953), 56, 310), the asymptotic relative efficiency of Wilcoxon's test
compared with Student's t is $\sqrt{(3/\pi)} = 0.977$ when there is normal variation, the reviewer at least is
not convinced of the need for a new test nor of its superiority to Wilcoxon's.

D. E. BARTON


PUBLICATIONS OF THE U.S. DEPARTMENT OF COMMERCE, 
NATIONAL BUREAU OF STANDARDS 


(i) Tables of the error function and its derivative. Applied Mathematics Series 41.
1954. [Re-issue of the 1941 Mathematical Table No. 8.] Pp. xi+302. $3.25.

The main table provides values of the functions

$$H(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-\alpha^2}\, d\alpha \quad \text{and} \quad H'(x) = \frac{2}{\sqrt{\pi}}\, e^{-x^2}$$

to 15 decimal places, for argument x = 0 (0.0001) 1.0000 (0.001) 5.600. A short supplementary table
gives values of 1 − H(x) and H'(x) to eight significant figures for x = 4.00 (0.01) 10.00.
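Individual values of these functions can now be checked with a few lines of code. The sketch below assumes only the definitions above and uses Python's standard `math.erf` and `math.erfc`; the wrapper names are hypothetical. Double-precision arithmetic carries roughly 15 to 16 significant figures, so it is a spot check and not a substitute for the 15-decimal table.

```python
import math


def H(x: float) -> float:
    """Error function H(x) = (2/sqrt(pi)) * integral from 0 to x of exp(-a^2) da."""
    return math.erf(x)


def H_prime(x: float) -> float:
    """Derivative H'(x) = (2/sqrt(pi)) * exp(-x^2)."""
    return 2.0 / math.sqrt(math.pi) * math.exp(-x * x)


def H_complement(x: float) -> float:
    """1 - H(x), computed with erfc so that accuracy is kept for large x."""
    return math.erfc(x)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 4.0):
        print(f"{x:4.1f}  {H(x):.15f}  {H_prime(x):.15f}  {H_complement(x):.8e}")
```

Using `erfc` for the supplementary quantity avoids the cancellation that `1 - erf(x)` would suffer once H(x) is within rounding distance of 1.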


(ii) Table of sine and cosine integrals for arguments from 10 to 100. Applied
Mathematics Series 32. 1954. [Re-issue of the 1942 Mathematical Table No. 13.]
Pp. xv+187. $2.25.

The main table provides values of the functions

$$\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\, dt \quad \text{and} \quad \mathrm{Ci}(x) = \int_\infty^x \frac{\cos t}{t}\, dt$$

with second differences, to 10 decimal places, for argument x = 10.00 (0.01) 100.00. In addition, to
facilitate interpolation, auxiliary tables are given of ½p(1 − p) and ⅙p(1 − p²) for p = 0 (0.001) 1.000.
There is also a table to 15 decimal places of multiples of ½π, i.e. of ½nπ for n = 1 (1) 100.


(iii) Table of the gamma function for complex arguments. Applied Mathematics
Series 34. 1954. Pp. xvi+105. $2.00.

This table gives the real and imaginary parts of $\log_e \Gamma(z)$ for z = x+iy, and x, y = 0.0 (0.1) 10, each to
12 decimal places. Auxiliary tables of sin πx, cos πx, sinh πx and cosh πx are given to 15 decimal places
(or 15 significant figures) for x = 0 (0.1) 10.0 to facilitate extension of the scope of the table.




OTHER BOOKS RECEIVED 


Famous Problems and other Monographs. By F. KLEIN et al. New York: Chelsea
Publishing Company. 1955. Pp. 321. $3.25. 

Income of American People. By H. P. MILLER. New York: John Wiley and Sons,
Inc.; London: Chapman and Hall, Ltd. 1955. Pp. xvi+206. 44s.


Analysis of Confounded Factorial Experiments in Single Replication. By F. E.
BINET et al. North Carolina Agricultural Experiment Station (Technical Bulletin
No. 113). 1955. Pp. 64.

Theoretical Genetics. By R. B. GOLDSCHMIDT. London: Cambridge University Press,
for University of California Press. 1956. Pp. 563. 64s. 


Health in Industry (Sickness Absence Statistics). For London Transport Executive by 
Butterworth's Medical Publications. 1956. Pp. 177. 35s. 


Elementary Topology. By D. W. HALL and G. L. SPENCER. New York: John Wiley
and Sons, Inc.; London: Chapman and Hall, Ltd. 1955. Pp. 303. 56s. 


Rectangular-Polar Conversion Tables. By E. H. NEVILLE. London: Cambridge
University Press, for the Royal Society. 1956. Pp. xxxii+109. 30s.

Irrationalzahlen, 2nd edition. By O. Perron. New York: Chelsea Publishing Co. 1951. 
Pp. 199. $1.50. 

Trigonometrical Series, 2nd edition. By A. ZYGMUND. New York: Chelsea Publishing
Co. 1952. Pp. 329. $1.50. 

Statistics. By W. A. WALLIS and H. V. RoBERTS. Glencoe, Illinois: The Free Press. 
1956. Pp. xii+ 646. $6. 

The Essentials of Educational Statistics. By Francis G. CORNELL. New York: John 
Wiley and Sons, Inc.; London: Chapman and Hall, Ltd. 1956. Pp. 375. 46s.

Annual Epidemiological and Vital Statistics 1953. World Health Organization, 
Geneva, Switzerland. (U.K. Sales Agent, H.M.S.O.) 1956. Pp. 571. 50s., $10 or 
Sw. fr. 30. 




CORRIGENDA 


Correlated Random Normal Deviates. Tracts for Computers, No. 26. By E. C.
FIELLER, T. LEWIS and E. S. PEARSON.
These tables were issued by the Department of Statistics, University College, London, and published 
by the Cambridge University Press in 1955. As an experiment in method of reproduction, the tables 
were not set up in type but were reproduced direct by photo-offset process from sheets prepared at the 
National Physical Laboratory on an electromatic typewriter. It was hoped that this method would 
provide a clear facsimile copy of the original sheets, but it has been found unfortunately that this is not 
the case and a number of errors have appeared in the tables.* These consist of 
(a) missing or broken negative signs,
(b) broken figures, 
(c) one incorrect figure and one addition of a negative sign, probably due to printers’ ‘touching up’. 
These errors or opportunities for the misreading of figures are more serious in the present table than 
they would have been in some tables of random numbers or random normal deviates, because on each 
page column totals have been included in the form of $\Sigma x$, $\Sigma x^2$, $\Sigma(x_0 x_i)$ and $r(x_0, x_i)$. Thus there is a risk
of inconsistent results arising from certain uses of the figures.


The 60 pages of published table have now been carefully checked with the original sheets. The errors 
found falling under headings (a) and (c) are listed below in full; for (b) only those cases in which real 
ambiguity seems likely to arise are given. 

A full correction will be made to the figures in the second impression of the Tract which will be issued 
during the coming year. In the meantime, the authors and publishers wish to express their sincere 
regret to all those who have purchased or used the Tract in its present form. E.C.F., T.L., E.S.P.


(a) For the values listed here the negative sign is completely missing, or less than half there 


Page Column Row Page Column Row 
1 E 24 22 v 30 
1 ES 6* 23 a 34 
3 EN 19* 24 E 46 
4 D 25 28 " 1 
4 EA 47 and 48 30 A 44 
5 Zo 46 32 is 10 
6 v, 47 32 a 5 
s rs 19 35 B. 44* 
; 2% 50 36 es 22 

d 42* 37 ER 29 
1 x, 47% 37 Sa 24 
m ty 15 40 5s 37 
13 E an 44 2, 36* 
4 ta 24 47 25 43* 
Ta 25 52 x, 44 
19 5 46 53 "s 43 
13 ^ 19 54 n 15* 
" 2 10 56 M 19 
Ty 37 56 is 8 
4 fa 45% 56 a 16 
. 25 "p 13 25 " 56 = 31 
E 2 32 56 E 42 
2) zi 10 57 5s 46 
si E 4l 59 Es 15 
a 47 59 : 


"S a, 26 
* Negative sign missing altogether.




(b) These values all have broken figures which might be misread 


Page Column Row Should read 
5 2s Ex. 49-5068 
12 " 40 —0:37 
16 X Er —5-95 
16 a 19 0-83 
25 a Ir — 6-54 
28 cs 7 153 
34 2. 40 —-1:36 
36 K^ an 10 2-20 
41 = 3s 44 delete mark in front of number 


which might be taken as a 
- minus sign. Read 1:01 
48 2 


1 0-66 
51 d 50 0:37 
60 m 9 t 1:88 
(c) Incorrect addition? " 
6 ay 43° For —0-60 read 0-60 
, 38 n 2009 00 For -063 read -034 
" A " ny 4 n^ hi A "M 
eo ie § 
vue "n * A wr. H 


a, 


J. GANI. ‘Some theorems and sufficiency conditions for the maximum likelihood estimator
of an unknown parameter in a simple Markov chain.’ Biometrika, 42, 342-59.

The statement in line 8 below equation (22) on p. 352 is wrong […] leading to (23);
this result can be obtained more simply as follows.

Instead of considering the probabilities $p(x_i, \theta)$ that the ith observation of the variate x be $x_i$, write
$p_j(\theta)$ as the probability that the variate x has the value j, where j is some integer in the given range.
Let $n_j$ be the frequency with which the value j occurs in the sample, so that $\sum_j n_j = n$, the total number


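of observations. Writing the likelihood function as

$$L = \sum_j n_j \ln p_j(\theta),$$

the factorizability condition gives

$$\sum_j n_j \ln p_j(\theta) = \ln f(n_1, n_2, \ldots) + \ln F(T, \theta),$$

where $T = T(n_1, n_2, \ldots)$ is a sufficient estimator of $\theta$. Providing $p_j(\theta)$ and $F(T, \theta)$ are differentiable with
respect to $\theta$, this differentiation will lead to

$$\frac{\partial}{\partial\theta} \sum_j n_j \ln p_j(\theta) = \sum_j n_j \frac{\partial}{\partial\theta} \{\ln p_j(\theta)\} = \frac{\partial}{\partial\theta} \ln F(T, \theta) = G(T, \theta),$$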


where $G(T, \theta)$ is some linear function of the frequencies $n_j$.

Now let us assume that $T$ is a function formally differentiable in each of the $n_j$, and that $G(T, \theta)$ is
similarly differentiable in $T$; differentiation of $G(T, \theta)$ with respect to $n_j$ leads to

$$\frac{\partial G(T, \theta)}{\partial T} \frac{\partial T}{\partial n_j} = u_j(\theta) \quad (j = 1, 2, \ldots).$$

Since $T$ is a function of the $n_j$ only, this relation shows that $\partial T/\partial n_j$ can only be some function $1/v_j(T)$
of $T$, so that

$$\frac{\partial G(T, \theta)}{\partial T} = u_j(\theta)\, v_j(T) \quad (j = 1, 2, \ldots).$$




The equality of these products for all j is possible only if

$$u_j(\theta) = c_j K(\theta), \quad v_j(T) = c_j^{-1} g'(T),$$

where the $c_j$ are constants, and the form of the function $G(T, \theta)$ is, therefore,

$$G(T, \theta) = K(\theta) \int g'(T)\, dT + k(\theta) = K(\theta)\, g(T) + k(\theta),$$

as previously obtained in the paper. The function $g(T)$ is clearly

$$g(T) = \int g'(T)\, dT = \sum_j \int g'(T) \frac{\partial T}{\partial n_j}\, dn_j = \sum_j c_j n_j,$$

a linear function of the $n_j$.


The result (23), where the likelihood function is expressed in terms of individual observations $x_i$
instead of the frequencies $n_j$, follows directly.

I am greatly indebted to Mr E. […] for pointing out my error, and to Prof. G. A. Barnard for
suggesting the proof above.


JOHN WISHART

1898-1956

John Wishart, who has been connected with this journal in an editorial capacity since […],
died suddenly in Mexico on 14 July last. An obituary and appreciation will appear in the
next number.